FOREWORD


Science  and  Technology  Corporation  (STC)  is  pleased  to  submit  this  interim  report  entitled 
"Regional-scale  Analysis  and  Forecasting  (RAP)  Report"  as  part  of  Contract  No.  F19628-89-C-0167. 
The  long-term  objective  is  to  develop  and  demonstrate  a  regional  analysis  procedure  (RAP)  using 
optimum interpolation (OI) in both real and simulation modes. The RAP will assimilate all available
meteorological data and fuse them efficiently into a high resolution analysis of mass, motion, and
moisture  fields.  This  report  describes  the  five  major  tasks  to  be  accomplished,  the  present  status  of 
those  tasks,  the  optimum  interpolation  scheme,  the  development  of  the  RAP  databases,  case  studies 
of  experiments  in  numerical  analysis,  and  plans  for  completing  the  major  tasks.  The  valuable 
technical  assistance  provided  by  Donald  Norquist,  Contract  Monitor,  is  acknowledged  and  greatly 
appreciated. 


1.  INTRODUCTION 


The  foundation  of  the  relocatable  regional  analysis  procedure  (RAP)  has  been  solidly 
established.  RAP  is  a  multivariate,  multisource  analysis  scheme  for  relocatable,  regional-scale 
applications.  Specifically,  the  scheme  incorporates  optimum  interpolation  for  the  numerical  analysis 
and  stepwise  regression  for  selection  of  observations. 

The  procedure  uses  meteorological  fields  from  various  sources  on  a  regional  scale  analysis 
grid, called uniform gridded data fields. Observations of varying availability in time and space are
used  in  an  optimum  sense  to  produce  the  best  possible  depiction  of  the  variables  within  the  regional 
atmospheric  volume. 

In  addition,  observing  system  simulation  experiments  will  be  conducted  to  establish 
confidence  levels  for  the  regional  analyses.  Finally,  various  short  term  (out  to  12  hr)  regional 
forecast  methods  (such  as  persistence,  numerical  or  some  combination)  will  be  examined  for  the 
purpose  of  recommending  an  optimal  forecast  procedure. 

A  brief  description  of  present  RAP  objectives  is  provided  below.  The  remainder  of  the 
report  describes  the  five  major  tasks  to  be  accomplished,  the  present  status  of  those  tasks,  the 
optimum  interpolation  scheme,  the  development  of  the  RAP  databases,  case  studies  of  experiments 
in  numerical  analysis,  and  plans  for  completing  the  major  tasks. 

The  long-term  objective  of  the  RAP  is  to  develop  and  demonstrate  a  regional  analysis 
procedure  using  optimum  interpolation  (OI)  in  both  real  and  simulation  modes.  The  RAP  will 
assimilate  all  available  meteorological  data  and  fuse  them  efficiently  into  a  high  resolution  analysis 
of  mass,  motion,  and  moisture  fields.  To  accomplish  the  long-term  objective,  several  short-term 
objectives  are  presently  being  pursued: 

1.  Restructuring the observations database so that forecast errors can be calculated
efficiently

2.  Performing numerical analysis experiments with the observation and forecast
database to test the multivariate optimum interpolation scheme in all analysis
variables, ensuring that the resulting analyses are dynamically consistent


3.  Completing the calculation of error statistics (differences between the Relocatable
Window  Forecast  Model  [RWFM]  forecasts  and  observations)  and  developing  error 
correlation  functions 

4.  Continuing  to  integrate  into  the  RAP  the  latest  techniques  obtained  from  the 
literature  on  developing  operational  optimum  interpolation  schemes  and  specifying 
the  errors  of  present  and  future  observing  systems 

5.  Completing  the  extraction  of  the  First  GARP  (Global  Atmospheric  Research 
Program)  Global  Experiment  (FGGE)-2b  observations  and  T-106  nature  run 
forecasts from magnetic tapes from the European Centre for Medium-Range
Weather  Forecasts  (ECMWF),  and  unpacking  the  required  gridded  forecast  fields 
that  can  be  interpolated  to  the  observation  points  to  simulate  an  observation  network 
for  use  in  observing  systems  simulation  experiments  (OSSEs) 

6.  Generating  RWFM  forecasts  from  the  nature  run 

7.  Testing  the  RWFM  verification  model  (RWFMVER) 


2.  RAP  TASKS 

The  research  and  development  effort,  which  requires  a  great  deal  of  software  modification 
and  development,  is  specified  in  the  Science  and  Technology  Corporation  (STC)  Technical 
Report 3072, RAP Initial Work Plan, September 1989. Each of the five dependent tasks is
described below, along with a brief summary of the progress toward its completion. STC Task 4,
which was begun most recently, is described in considerable detail. Appendix A is a report on the
software modification and development required for RAP.

2.1  STC  TASK  1


Task  1  is  to  develop  an  automated  relocatable,  regional  multivariate  objective  analysis 
procedure  using  optimum  interpolation  (OI).  There  are  two  major  subtasks:  (1)  design  and  develop 
an  OI  scheme,  and  (2)  incorporate  the  RAP  analysis  algorithm  into  the  relocatability  and  variable 
resolution  framework  of  the  Relocatable  Window  Analysis  Model  (RWAM). 

The  OI  scheme,  based  on  objective  data  selection  and  buddy  checking  by  forward  stepwise 
regression  (FSR),  has  now  been  carefully  tested  on  all  meteorological  variables,  specifically,  surface 
pressure,  height  of  isobaric  levels,  temperature,  humidity,  and  u-wind  and  v-wind  components. 

The  RWAM  is  presently  in  a  nonoperational  status  at  the  Air  Force  Global  Weather  Central 
(AFGWC),  which  has  not  delivered  the  RWAM  code  (and  probably  will  not  in  the  near  term); 
therefore,  work  was  focused  on  completing  the  testing  of  the  OI  scheme. 

2.2  STC  TASK  2 

Task  2  also  has  two  subtasks.  The  primary  subtask  is  to  compute  and  model  the  first-guess 
forecast  errors,  from  which  correlation  functions  can  be  developed  to  model  these  errors.  The 
secondary  subtask  is  to  determine  and  model  observation  errors,  if  these  are  needed  by  the  OI 
scheme. 

The  primary  task  has  dominated  much  of  the  effort  so  far,  requiring  first  the  development 
of  extensive  analysis,  forecast,  and  observations  databases,  which  are  ready  for  use.  The  module 
that  calculates  error  correlations  is  based  on  Hollingsworth  and  Lonnberg  (1986),  Lonnberg  and 
Hollingsworth (1986), and Thiebaux et al. (1986). The design (Section 4.5) specifies that the
correlation  pairs  will  be  placed  into  bins,  which  are  a  function  of  the  distance  between  the  pairs,  and 
averaged. 
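
The binning step described above can be sketched as follows. This is only an illustration of the general idea (group station-pair correlations by separation distance, then average within each bin), not the module described in Section 4.5. The function name, the bin width, and the sample values are hypothetical.

```python
from collections import defaultdict


def bin_average_correlations(pairs, bin_width_km):
    """Average correlation values within distance bins.

    pairs: iterable of (distance_km, correlation) tuples, one per station pair.
    Returns a list of (bin_center_km, mean_correlation, count), sorted by distance.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance_km, corr in pairs:
        index = int(distance_km // bin_width_km)  # bin 0 covers [0, bin_width_km)
        sums[index] += corr
        counts[index] += 1
    return [((i + 0.5) * bin_width_km, sums[i] / counts[i], counts[i])
            for i in sorted(counts)]


# Hypothetical station-pair values, for illustration only.
sample_pairs = [(40.0, 0.92), (80.0, 0.88), (130.0, 0.75), (170.0, 0.71), (260.0, 0.50)]
binned = bin_average_correlations(sample_pairs, bin_width_km=100.0)
for center, mean_corr, count in binned:
    print(f"{center:6.1f} km  mean r = {mean_corr:.3f}  (pairs = {count})")
```

A correlation function could then be fitted to the binned means; the choice of functional form is left open here.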

As shown in Section 3.1, the theory of the OI scheme leads to the conclusion that the
observation errors are of secondary importance. The small errors in the case studies of numerical
analyses, discussed in Section 5, are consistent with this conclusion. Nevertheless, the assertion
remains to be proved.

2.3  STC  TASK  3


Task  3  is  to  develop  a  relocatable  verification  package,  using  standard  measures  of  error  such 
as  the  RWFM  verification  model  (RWFMVER),  the  Phillips  Laboratory  (PL)  Global  Spectral  Model 
(GSM)  diagnostic  package,  and  map  comparisons.  The  package  is  for  testing  RAP  formally  in  real 
data  experiments.  This  testing  will  include  comparing  RAP  against  RWAM,  the  AFGWC 
High-Resolution  Analysis  Model  (HIRAS),  and  real  observations,  and  with  forecasts  made  by  the 
AFGWC  Relocatable  Window  Forecast  Model  and  the  PL  GSM. 

The  forecast  comparisons  have  been  completed,  and  RWFMVER  is  presently  being 
converted  to  execute  on  the  RAP  automated  data  processing  system.  Under  this  task  RAP  will  be 
run  on  scenarios  not  included  in  the  original  database,  and  RAP  will  be  checked  against  other 
analyses  and  observations.  Also,  the  verification  package  can  be  used  to  improve  RAP’s 
methodology. 

2.4  STC  TASK  4 

The  observing  systems  simulation  experiments  (OSSEs)  can  proceed  formally  only  after 
demonstrated  success  of  Tasks  1  and  2  in  Task  3.  The  OSSEs  consist  of  several  subtasks,  all  of 
which  will  require  2  years  to  complete.  They  are  described  in  detail  because  they  will  be  the  main 
focus  of  effort  for  the  remainder  of  this  project. 

1.  The  first  subtask,  to  build  a  database  of  simulated  FGGE-2b  observations  from  the 
ECMWF  T-106  nature  run  tapes,  is  partially  completed.  The  ECMWF  GRIB 
software  was  used  to  extract  selected  forecast  files  from  the  tapes  at  the  Phillips 
Laboratory/Geophysics  Directorate  (PL/GP)  Computer  Center.  In  addition,  the 
FGGE-2b  files  have  also  been  extracted.  All  required  files  (forecasts  and  upper  air 
observations  at  6-hr  intervals  from  19  January  1979  through  1  February  1979)  have 
been  transferred  to  the  CRAY-2  at  the  Phillips  Lab  Supercomputer  Center  (PLSC), 
where the needed fields will be unpacked and stored on a 1 x 1 degree grid at
mandatory levels. "Perfect" observation files at FGGE-2b data locations will be
simulated by interpolating the forecasts from the grid to the observation points to
replace data actually observed.

The following detailed description of the plans for the OSSEs is in three parts that
follow  one  after  the  other  in  natural  sequence:  the  development  of  datasets  of 
"perfect  observations,"  the  development  of  an  analysis  and  forecast  database,  and  the 
preparation  of  RAP  analyses  under  different  scenarios. 

The  next  step  is  to  simulate  an  observation  network  of  existing  and  proposed 
sensors/systems  by  interpolating  the  gridded  T-106  forecast  data  to  FGGE-2b 
observation  points  to  mimic  "real"  data  tapes.  By  assigning  errors  (random  and 
systematic  instrument  errors)  to  these  "observations"  extracted  from  the  nature  run, 
we  can  create  files  of  simulated  observations. 

The  Air  Force  Geophysics  Laboratory  (AFGL)  Statistical  Analysis  Package  (ASAP), 
which takes a 1 x 1 degree gridded first guess field and interpolates it to observation points,
has  been  modified  to  accept  T-106  gridded  data.  Perfect  FGGE-2b  files  will  be  built 
by  treating  each  gridded  nature  run  field  as  a  first  guess,  modifying  the  MASTORX 
subroutines  to  read  the  FGGE-2b  observations  from  the  files,  and  writing  an  output 
file  with  the  corresponding  nature  run  values  at  each  site. 

Software  is  being  developed  to  generate  "perfect  observations"  by  ingesting  and  time- 
interpolating the T-106 "nature run" ontime forecasts (at 00, 06, 12, and 18 UTC) to
FGGE-2b  observation  points,  including  satellite  data  and  AIREPS.  A  second  set  of 
data,  developed  by  introducing  desired  observational  (random  and  correlated)  errors, 
will  be  used  to  act  as  the  actual  simulated  FGGE-2b  observations  at  6-hr  intervals. 

2.  The  second  major  subtask  is  to  generate  an  analysis  and  forecast  database  from  the 
simulated  FGGE-2b  observations.  The  database  will  make  up  the  CONTROL 
scenario,  which  is  based  on  the  existing  observation  system,  for  the  RAP  OSSEs.  The 
procedure is described below and depicted in Fig. 1.

Input/Output Required for the OSSEs (dates are nature-run days in 1979; x = 1 denotes that all
simulated FGGE-2b observations are used, and x = 2 denotes that conventional observations are
withheld)

16 Jan 00 UTC to 19 Jan 00 UTC:  S1
    Run the GSM for 72 hr to generate a "spin-up" atmosphere (S1). Use the T-106 forecast
    at 00 h. S1 is a first guess for the AFGL Statistical Analysis Program (ASAP).

19 Jan 00 UTC to 21 Jan 00 UTC:  GDA1x
    Two 48-hr global data assimilations (GDAs) are required at 6-hr intervals. ASAP with all
    observations = GDA11; withholding conventional observations = GDA12. Thus, the GDAs from
    the GSM and ASAP are scenario dependent. The observations in all cases are simulated
    FGGE-2b observations (T-106 forecasts interpolated to FGGE-2b observation points, with
    observation errors added).

21 Jan 00 UTC to 22 Jan 12 UTC:  G11/G12 and R11/R12
    Run 36-hr GSMs (G1x) and RWFMs (R1x), using the 48th hr of the GDAs as initial
    conditions; output from the RWFM is R11 and R12. Store the 12-, 18-, 24-, and 36-hr
    forecasts, which will be the RAP first guesses.

21 Jan 00 UTC to 24 Jan 00 UTC:  S2
    Run a 72-hr GSM spin-up forecast (S2) starting at 21/00 UTC (T-106 h = 120).

24 Jan 00 UTC to 26 Jan 00 UTC:  GDA2x
    Use S2 to initialize two 48-hr GDAs run at 6-hr intervals. The GSM and ASAP with all
    observations = GDA21; ASAP without the conventional observations = GDA22. Simulated
    FGGE-2b observations (T-106 forecasts interpolated to FGGE-2b observation points and
    distorted by adding observation errors) are used.

26 Jan 00 UTC to 27 Jan 12 UTC:  G21/G22 and R21/R22
    Start the second run of 36-hr GSMs (G2x) and RWFMs (R2x), using the 48th hr of the GDAs
    as initial conditions; output from the RWFM is R21 and R22, which will be first guesses
    for RAP. Store the 12-, 18-, 24-, and 36-hr forecasts.

26 Jan 00 UTC to 29 Jan 00 UTC:  S3
    Run the third 72-hr GSM spin-up forecast (S3), starting at 26/00 UTC, that is, T-106
    forecast h = 240.

29 Jan 00 UTC to 31 Jan 00 UTC:  GDA3x
    Use S3 to initialize the third (and final) set of two 48-hr GDAs at 29/00 UTC, run at
    6-hr intervals. The GSM/ASAP with all observations = GDA31; ASAP without the
    conventional observations = GDA32. The observations are the simulated FGGE-2b
    observations (T-106 forecasts interpolated to FGGE-2b observation points, with
    observation errors added).

31 Jan 00 UTC to 1 Feb 12 UTC:  G31/G32 and R31/R32
    Start the third (and final) 36-hr GSM forecasts (G3x) and RWFMs (R3x). Use the 48th hr
    of GDA3x as initial conditions; output from the RWFMs is R31 and R32, which will be
    first guesses for RAP. Store the 12-, 18-, 24-, and 36-hr forecasts for RAP first
    guesses.

Figure 1.  A depiction of the process for generating the OSSE databases.

First, 3-day global "spin-up" atmospheres are generated by running GSM forecasts
for 72 hr, initialized at 00 UTC on 16, 21, and 26 January 1979 with the nature run
from the ECMWF to ensure there are ample differences between the nature run and
analyses. Next, ASAP begins with each of the 72-hr GSM forecasts and produces the
2-day global data assimilations (GDAs) using the globally simulated FGGE-2b
observations at 6-hr cycles. There will be two GDAs for each of the three 72-hr
GSM forecasts, one with conventional (that is, RAOB, PIBAL, and AIREPS)
observations and one without conventional observations in a hostile zone (that is,
using only satellite data in that specific volume).

RWFM forecasts (which begin at 00 UTC on 21, 26, and 31 January 1979 of the
nature run) will be generated from the GDAs. The GDAs are input to the GSM,
which will be run for 36 hr to provide 6-hr forecast fields. These fields are the
initial and boundary conditions for 36-hr runs of the RWFM for the Eurasian and the
Central American regions.

RWFMVER  will  be  used  to  compare  RWFM  forecasts  (filtered  to  the  nature  grid) 
with  the  nature  run.  The  purpose  is  to  determine  the  RWFM  forecast  errors  as  a 
function  of  region  and  forecast  length,  and  compare  simulation  errors  with  actual 
forecast  errors  (from  Task  2)  to  assess  the  realism  of  the  OSSEs. 

3.  The  final  step  is  to  prepare  several  RAP  analyses  from  the  simulated  databases  and 
assess the value of the experiments. With RWFM forecasts valid at 12, 18, 24,
and  36  hr,  and  12-hr-  and  24-hr-old  RAP  analyses  as  the  first  guess,  RAP  will 
generate  analyses  using  the  requested  scenarios  of  simulated  observations  (Schaaf, 
1990),  that  is, 

•  CONTROL:  existing observing system

•  ALLOBS:   existing and proposed observing systems

•  TACOBS:   ALLOBS except TOVS and conventional observations in
             hostile side of battle area, but in "HIRAS" exclude only the
             conventional observations

•  OLDOBS:   (1) 12-hr RWFM forecast valid at time t and offtime
                 observations at t-6 hr
             (2) same as (1) but with 24-hr RWFM forecast
             (3) 18-hr RWFM forecast at time t+6 and ontime
                 observations at time t

RWFMVER will be used to compare (grid-to-station and grid-to-grid) RAP analyses,
which are filtered to the nature grid ("truth" for the OSSEs), with the nature run.
Comparisons are required with RWFM, RWAM (if available), and GSM for each
region at the above time periods and data denial scenarios. Also, analysis errors
(computed as a by-product of OI) will be compared with RAP-nature differences to
determine the real forecast error statistics generated for the error-level module in
Task 2, and they will be tabulated by region, season, first guess scenario, and data
denial scenario.
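
The construction of "perfect" and simulated observations described under the first subtask can be sketched as follows. This is a minimal illustration, assuming a regular latitude-longitude grid, bilinear interpolation, a constant systematic error (bias), and independent Gaussian random errors. The function names and all numerical values are hypothetical, and correlated errors are not modeled.

```python
import random


def bilinear(grid, lat0, lon0, spacing, lat, lon):
    """Bilinearly interpolate grid[row][col] to the point (lat, lon).

    Rows run south to north from lat0 and columns west to east from lon0,
    both in steps of `spacing` degrees. The point is assumed to lie inside
    the grid; no wrap-around or extrapolation is attempted.
    """
    y = (lat - lat0) / spacing
    x = (lon - lon0) / spacing
    j = min(int(y), len(grid) - 2)      # row just south of the point
    i = min(int(x), len(grid[0]) - 2)   # column just west of the point
    ty, tx = y - j, x - i
    south = (1 - tx) * grid[j][i] + tx * grid[j][i + 1]
    north = (1 - tx) * grid[j + 1][i] + tx * grid[j + 1][i + 1]
    return (1 - ty) * south + ty * north


def simulate_observations(grid, lat0, lon0, spacing, sites, bias, sigma, rng):
    """Return (perfect, simulated) values at the observation sites.

    perfect:   gridded "nature" values interpolated to each site.
    simulated: perfect values plus a systematic error (bias) and a random
               Gaussian error with standard deviation sigma.
    """
    perfect = [bilinear(grid, lat0, lon0, spacing, lat, lon) for lat, lon in sites]
    simulated = [p + bias + rng.gauss(0.0, sigma) for p in perfect]
    return perfect, simulated


# Hypothetical 3 x 3 patch of a 1 x 1 degree field (value = lat + 2 * lon).
grid = [[lat + 2.0 * lon for lon in (0.0, 1.0, 2.0)] for lat in (40.0, 41.0, 42.0)]
sites = [(40.5, 0.25), (41.75, 1.5)]
perfect, simulated = simulate_observations(
    grid, 40.0, 0.0, 1.0, sites, bias=0.5, sigma=1.0, rng=random.Random(0))
print(perfect)
print(simulated)
```

Because the sample field is linear in latitude and longitude, bilinear interpolation reproduces it exactly, which makes the "perfect" values easy to check by hand.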

2.5  STC  TASK  5 

Task 5 is to survey the literature for candidate forecast techniques to extend the RAP analysis out to
12  hr.  The  obvious  candidate,  persistence,  may  not  be  a  good  12-hr  forecast  for  small  regions 
because  of  advection.  STC  will  consider  both  dynamical  and  statistical  techniques;  for  example,  the 
known  difference  field  between  the  RAP  and  the  RWFM  00-hr  forecast  could  be  used  to  modify  the 
RWFM  12-hr  forecast.  STC  will  document  the  advantages  and  disadvantages  of  techniques  for  use 
in  the  field  and  recommend  a  candidate  for  further  study. 
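
The difference-field idea mentioned above can be sketched as follows. This is only one possible reading of it: the 00-hr difference between the RAP analysis and the RWFM forecast is added, pointwise, to the RWFM 12-hr forecast. The function name, the optional `weight` factor, and the sample values are hypothetical and do not come from the work plan.

```python
def corrected_forecast(rap_analysis_00, rwfm_00, rwfm_12, weight=1.0):
    """Add the 00-hr (RAP minus RWFM) difference field to the 12-hr RWFM forecast.

    All three inputs are sequences of gridpoint values on the same grid.
    weight = 1.0 carries the full difference forward; smaller values damp it.
    """
    return [f12 + weight * (analysis - f00)
            for analysis, f00, f12 in zip(rap_analysis_00, rwfm_00, rwfm_12)]


# Hypothetical surface-pressure values (hPa) at two gridpoints.
corrected = corrected_forecast([1012.0, 1008.5], [1010.0, 1009.5], [1006.0, 1011.0])
print(corrected)
```

With `weight=1.0` this amounts to assuming that the 00-hr model error persists unchanged for 12 hr, which is exactly the kind of assumption the survey would need to evaluate.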


3.  THEORETICAL  AND  PRACTICAL  CONSIDERATIONS  OF  OPTIMUM  INTERPOLATION 

Optimum  interpolation,  also  known  as  statistical  interpolation,  was  selected  as  the  analysis 
scheme  for  the  RAP.  The  following  subsections  discuss  the  details  of  the  scheme  and  the  rationale 
for  our  technical  approach,  in  particular,  the  methodology  for  rigorously  selecting  observations  and 
the  modeling  of  correlation  structure  functions. 

3.1  SENSITIVITY  OF  INTERPOLATION  WEIGHTS  TO  CORRELATION  FUNCTIONS 

Given a gridpoint (g) surrounded by a number of observation points i (i = 1, 2, ..., n), the
process of optimum interpolation determines the relative weights ($W_i$) assigned to each observation
in the expression

$$ f_g^a = f_g^p + \sum_{i=1}^{N'} W_i \left( f_i^o - f_i^p \right) \qquad (1) $$

where $f_g^a$ is the estimated (or analyzed) value of the variable at the gridpoint g, $f_g^p$ is a preliminary
(first guess) value of the variable at g, and $f_i^o$ and $f_i^p$ are, respectively, observed and first guess
values of the variable at the observation points. In Eq. 1, $N' \le N$ implies that it may be desirable
to use fewer than all of the available observations. Since, in general, first guess values are obtained
from a prior prediction of the field quantity (f), $f^p$ is generally available only at gridpoints of the
prediction model, that is, $f_g^p$. Therefore, the $f_i^p$ are obtained by a suitable interpolation process
from the $f_g^p$.
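
The analysis update of Eq. 1 (the first guess at the gridpoint plus a weighted sum of observed-minus-first-guess differences at the selected observation points) can be illustrated with a short sketch. The function name and the numerical values are hypothetical; the weights are simply assumed here rather than derived.

```python
def oi_analysis(first_guess_at_grid, weights, observed, first_guess_at_obs):
    """Apply the OI update at one gridpoint.

    weights, observed, and first_guess_at_obs are equal-length sequences,
    one entry per selected observation point.
    """
    increments = [obs - guess for obs, guess in zip(observed, first_guess_at_obs)]
    return first_guess_at_grid + sum(w * d for w, d in zip(weights, increments))


# Hypothetical temperatures (K): two observations near one gridpoint.
analysis = oi_analysis(
    first_guess_at_grid=280.0,
    weights=[0.5, 0.25],
    observed=[283.0, 280.0],
    first_guess_at_obs=[281.0, 281.0],
)
print(analysis)
```

Here the first observation is 2 K warmer than its first guess and the second is 1 K cooler, so the gridpoint value is adjusted by 0.5 x 2 - 0.25 x 1 = 0.75 K.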


Keegan and Shapiro (1985) showed that the consistent, as well as the simplest, system of
equations to use in obtaining the weights $W_i$ in Eq. 1 is their Eqs. 2-16. That is,

$$ \sum_{i=1}^{N'} W_i \, r(\Delta_i, \Delta_j) \, \sigma_{\Delta_i} \sigma_{\Delta_j} = r(\Delta_j, \Delta_g) \, \sigma_{\Delta_j} \sigma_{\Delta_g} + r(\Delta_j, \epsilon_g^o) \, \sigma_{\Delta_j} \sigma_{\epsilon_g^o} \qquad (2) $$

where j = 1, 2, ..., N' also indicates an observation point, r(a, b) is the linear correlation coefficient
between the parameters a and b, $\sigma_a$ is the standard deviation of a, $\Delta_i = f_i^p - f_i^o$; and $\epsilon_g^o = f_g^o - f_g^t$
indicates the deviation of the observed value of the variable at the gridpoint ($f_g^o$) from the true
value ($f_g^t$). In other words, $\epsilon_g^o$ is the observation error at the gridpoint and is an unknown quantity
since the true value of the variable $f^t$ is unknown everywhere (both at gridpoints and observation
points) and the observed value $f_g^o$ must be obtained by interpolation from the $f_i^o$. Since $\epsilon_g^o$ in Eq. 2
enters as a correlation with $\Delta_j$, it is likely that $r(\Delta_j, \epsilon_g^o) \ll r(\Delta_j, \Delta_g)$. Therefore, neglect $r(\Delta_j, \epsilon_g^o)$, and
the right-hand side of Eq. 2 contains only the first term $r(\Delta_j, \Delta_g) \, \sigma_{\Delta_j} \sigma_{\Delta_g}$. (The significance of this
assumption will be tested in the simulation mode by assigning a series of nonzero values to the
neglected correlation and re-evaluating the $W_i$.) Note that $\Delta_g = f_g^p - f_g^o$ implies that an estimate
of the observed value at the gridpoint must be obtained by a suitable interpolation process from
the $f_i^o$.

The basic problem in the solution of the simplified form of Eq. 2 is the specification of the
$\Delta_i$, $\Delta_j$, and $\Delta_g$ statistics, namely, $r(\Delta_i, \Delta_j)$, $r(\Delta_j, \Delta_g)$, and $\sigma_{\Delta_i}$, $\sigma_{\Delta_j}$, and $\sigma_{\Delta_g}$. If accurate estimates of these
quantities were available, the application of OI would be a simple matter. Unfortunately, the
relevant statistics are essentially unknown. While some limited studies have been made on $r(f_i^o, f_j^o)$,
the required correlation fields of $r(\Delta_i, \Delta_j)$ are model dependent; therefore, because of the rapid rate
of alterations in operational models, there has been no possibility of developing long series of both
predicted and observed quantities from which the required statistics can be obtained.

In spite of this shortage of data, a substantial literature (e.g., Dey and Morone, 1985; and
DiMego, 1988) exists on the advantages and disadvantages of various models of the required
correlation functions. Without a firmer foundation in the facts of observations, however, it is not
feasible now to attempt to choose the best from among the various correlation models. STC Task
2 includes preparing a small sample of observational data, which matches forecasts from an
operational model, and calculating suitable correlation models and standard deviation statistics.
Nevertheless, it was useful first to test the sensitivity of the $W_i$ derived from Eq. 2 to variations in
the correlation functions.

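Expanding Eq. 2, neglecting $r(\Delta_j, \epsilon_g^o)$, and assuming the data have been normalized so that
$\sigma_{\Delta_i} = \sigma_{\Delta_j} = \sigma_{\Delta_g} = 1$, for one gridpoint g and n observation points the following linear equations
describe the system.

$$
\begin{aligned}
W_1 r_{11} + W_2 r_{21} + \cdots + W_n r_{n1} &= r_{g1} \\
W_1 r_{12} + W_2 r_{22} + \cdots + W_n r_{n2} &= r_{g2} \\
&\;\;\vdots \\
W_1 r_{1n} + W_2 r_{2n} + \cdots + W_n r_{nn} &= r_{gn}
\end{aligned}
\qquad (3)
$$

where, for example, $r_{12} = r(\Delta_1, \Delta_2)$ and $r_{g1} = r(\Delta_g, \Delta_1)$.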

It is apparent from Eq. 3 that the $W_i$ are the coefficients of the multiple regression of $\Delta_g$ on
all the $\Delta_i$. If the various correlations in Eq. 3 were known precisely, the total normalized
information content in the n observations with regard to their ability to estimate the analyzed value
of the variable at the gridpoint would be given by the square of the multiple correlation
coefficient. For example, if n = 2 and $r_{g1} = 0.9$, $r_{g2} = 0.8$, and $r_{12} = 0.7$, then the multiple
correlation coefficient ($r_{g.12}$) is 0.9309, $W_1 = 2/3$, $W_2 = 1/3$, and the normalized explained variance
is 0.8667. This particular set of correlations implies that observation point 1 is close to gridpoint g
and that observation point 2 is somewhat more distant from gridpoint g, but closer to gridpoint g
than to point 1. On the other hand, with the same distribution of observation points with respect to
the gridpoint, but with substantially different correlations ($r_{g1} = 0.7$, $r_{g2} = 0.6$, $r_{12} = 0.5$), the weights
are not very different. In this case $W_1 = 0.5333$ instead of 0.6667, but $W_2$ is still 1/3. The
multiple correlation and explained variance, however, are substantially different, namely 0.7572 and
0.5733, respectively. While it is desirable to have multiple correlation coefficients that are near unity,
implying small mean square error, in any one realization most, if not all, of the significance is contained
in the weights. Precise modeling of the various correlation functions is probably less important than
having simple, but self-consistent modeling functions.
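
The two numerical examples above can be checked by solving the normalized linear system of Eq. 3 directly. The sketch below is an illustration only: it uses a small Gaussian elimination routine so that it needs no external libraries, and the function names are hypothetical. The explained variance is computed as the sum of each weight times the corresponding gridpoint correlation, and the multiple correlation as its square root.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def oi_weights(obs_corr, grid_corr):
    """Return (weights, explained_variance, multiple_correlation).

    obs_corr:  n x n matrix of correlations between observation points.
    grid_corr: length-n list of correlations between gridpoint and observations.
    Assumes a physically consistent set of correlations (explained variance
    between 0 and 1); otherwise the square root is not meaningful.
    """
    weights = solve_linear(obs_corr, grid_corr)
    explained = sum(w * r for w, r in zip(weights, grid_corr))
    return weights, explained, math.sqrt(explained)


w_a, var_a, r_a = oi_weights([[1.0, 0.7], [0.7, 1.0]], [0.9, 0.8])
w_b, var_b, r_b = oi_weights([[1.0, 0.5], [0.5, 1.0]], [0.7, 0.6])
print([round(w, 4) for w in w_a], round(var_a, 4), round(r_a, 4))
print([round(w, 4) for w in w_b], round(var_b, 4), round(r_b, 4))
```

The same routine accepts any number of observation points, so it can also be used to explore how the weights change as observations are added or removed.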

Tests of the sensitivity of the $W_i$, first in a domain with two observation points and
subsequently with increased numbers of observation points, suggest that the above hypothesis is true.
Assigning correlations as in the preceding examples and covering a complete range of all reasonable
correlations offers a study that has the advantage of not being dependent on the availability of real
data (observations and model output). Such studies not only provide more definitive answers to
questions on the sensitivity of the $W_i$ to the values of the respective correlations, but they also clarify
the question of whether to limit the number of observations or to use all available observations. For
example, if there are two observations close to each other but somewhat distant from the gridpoint,
then $r_{g1} = r_{g2} = 0.7$ and $r_{12} = 0.9$ are reasonable results. In this case, the multiple correlation
($r_{g.12} = 0.7182$) is not much larger than either bivariate correlation, and the weights are
$W_1 = W_2 = 0.3684$. Either observation by itself would contain almost the same information as both
together, but the use of both has the advantage of partially suppressing random errors of
observation.

Table 1 illustrates the type of information that can be obtained from the sensitivity analysis,
even though this partial example contains only two observation points. With $r_{g1}$ and $r_{g2}$ as given (0.5
and 0.3, respectively), when $r_{12}$ is small (such as when points 1 and 2 are on opposite sides of g), say, for
example, 0, 0.1, and 0.2, the weights are nearly 0.5 and 0.3, respectively. When $r_{12}$ is comparable to
$r_{g1}$ and $r_{g2}$ (say 0.3 to 0.7), $W_2$ is near zero and $W_1$ is near 0.5. When $r_{12}$ is large ($\ge 0.8$), $W_2$ becomes more
and more negative as $r_{12} \to 1.0$, and $W_1$ becomes more and more positive; but the sum of the weights
remains $W_1 + W_2 < 0.45$. With $r_{g1}$ and $r_{g2}$ as given in Table 1, it is not possible for $r_{12}$ to be larger than 0.97;
otherwise $r_{g1}$ and $r_{g2}$ could not differ as much as they do. With $r_{12} \ge 0.98$, the multiple correlation
would be greater than 1, and the solution matrix from Eq. 3 would be degenerate (ill-conditioned).
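
The behavior described for Table 1 can be reproduced with the closed-form solution for two observation points. The sketch below is an illustration under that two-point assumption; the function names are hypothetical. It also checks whether the three correlations are mutually consistent by testing the sign of the determinant of the full 3 x 3 correlation matrix of (g, 1, 2), which turns negative between 0.97 and 0.98 for the values used in the table.

```python
import math

R_G1, R_G2 = 0.5, 0.3  # gridpoint-to-observation correlations used in Table 1


def two_point_solution(r_g1, r_g2, r_12):
    """Closed-form weights and explained variance for two observation points."""
    det = 1.0 - r_12 ** 2
    w1 = (r_g1 - r_12 * r_g2) / det
    w2 = (r_g2 - r_12 * r_g1) / det
    explained = w1 * r_g1 + w2 * r_g2
    return w1, w2, explained


def is_admissible(r_g1, r_g2, r_12):
    """True when the 3 x 3 correlation matrix of (g, 1, 2) has a positive determinant."""
    det3 = 1.0 + 2.0 * r_g1 * r_g2 * r_12 - r_g1 ** 2 - r_g2 ** 2 - r_12 ** 2
    return det3 > 0.0


rows = []
for r_12 in (0.0, 0.3, 0.6, 0.9, 0.97, 0.98):
    w1, w2, explained = two_point_solution(R_G1, R_G2, r_12)
    ok = is_admissible(R_G1, R_G2, r_12)
    rows.append((r_12, w1, w2, explained, ok))
    multiple = f"{math.sqrt(explained):.3f}" if ok else "  n/a"
    print(f"r_12={r_12:4.2f}  r_g.12={multiple}  W_1={w1:8.4f}  W_2={w2:8.4f}  "
          f"sum={w1 + w2:6.4f}  admissible={ok}")
```

For the inadmissible case the explained variance exceeds 1, which is the signal that the assumed correlations cannot all hold at once.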

Table 1.  Example of a sensitivity analysis, where $r_{g1}$ is the coefficient of correlation between
gridpoint g and observation point 1, $r_{g.12}$ is the multiple correlation coefficient, and $W_1$ and
$W_2$ are the relative weights assigned to observations at points 1 and 2, respectively.

    r_g1     r_g2     r_12     r_g.12     W_1        W_2
    0.5      0.3      0        0.583      0.5000      0.3000
    0.5      0.3      0.1      0.560      0.4747      0.2525
    0.5      0.3      0.2      0.540      0.4583      0.2083
    0.5      0.3      0.3      0.524      0.4505      0.1648
    0.5      0.3      0.4      0.512      0.4524      0.1190
    0.5      0.3      0.5      0.503      0.4667      0.0667
    0.5      0.3      0.6      0.500      0.5000      0.0000
    0.5      0.3      0.7      0.505      0.5686     -0.0980
    0.5      0.3      0.8      0.527      0.7222     -0.2778
    0.5      0.3      0.9      0.607      1.2105     -0.7895
    0.5      0.3      0.95     0.751      2.2051     -1.7949
    0.5      0.3      0.97     0.911      3.5364     -3.1303

3.2  FINAL  SELECTION  OF  OBSERVATIONS  IN  OPTIMUM  INTERPOLATION

On the basis of some preliminary studies including 26 randomly distributed artificial
observations within a regional area as well as follow-on studies involving densely distributed real data
in  a  European  region,  the  decision  was  made  to  base  the  observation  selection  procedure  on  stepwise 
linear  regression.  This  section  discusses  the  results  of  these  studies  and  presents  a  conceptual 
overview  of  the  nature  of  stepwise  regression.  To  set  the  stage  some  relevant  background  is 
provided  in  the  form  of  several  slightly  edited  pages  from  Keegan  and  Shapiro  (1985),  a  report 
prepared  under  contract  with  the  U.S.  Air  Force  Geophysics  Laboratory  (AFGL),  which  is  now 
Phillips  Laboratory. 


The  selection  of  the  relevant  observations  is  a  complex  problem.  An  analysis  volume 
must  first  be  specified.  This  is  a  time-space  volume  containing  the  interpolation  point 
and  all  relevant  observations.  Since  correlations  between  observations  generally 
decrease  with  increasing  space  (distance  or  time),  the  analysis  volume  will  probably 
have  a  radius  approximating  the  distance  corresponding  to  zero  correlation  between 
the  relevant  parameters,  although  this  need  not  be  strictly  true.  This  volume, 
however, may incorporate many potential observations of different types, different
independence, and different relevancies. The selection system must be able to assign
weights to the relevant observations consistent with their independent information
content. In essence, the problem of selection of observations may be considered to
be a multiple linear regression problem. Equation 2 shows that the correlation
between each pair of independent variables as well as between the dependent
variable and each of the independent variables may influence the $W_i$. However,
while some of the operational OI systems resemble a multiple linear regression
approach, all depart from such an approach to some extent.

With  regard  to  the  selection  of  observations,  it  would  be  desirable  to  eliminate,  as  far 
as  possible,  the  arbitrariness  in  the  OI  procedure.  If  too  many  observations  are 
allowed  to  influence  the  value  of  the  analyzed  variable,  not  only  do  the  computations 
become  laborious,  but  the  analysis  error  may  actually  increase.  The  independent 
information  contributed  by  each  observation  generally  decreases  in  proportion  to  the 
number  of  observations,  as  a  result  of  intercorrelation  among  the  independent 
variables. (This is the well-known problem of multicollinearity in linear regression,
leading to ill-conditioned matrices with small determinants.) On the other hand, if,
because of arbitrary selection rules, too few observations are selected, the analysis
will be far from optimum.

A  rational  basis  for  selecting  the  observations  that  will  be  allowed  to  influence  the 
analyzed  value  is  required.  The  first  step  in  this  selection  process  is  rather  simple 
since  it  involves  the  establishment  of  a  two-,  three-,  or  four-dimensional  influence 
space,  where  any  observation  has  a  possibility  of  affecting  the  value  of  the  analysand. 
In  a  sense,  defining  an  influence  space  is  artificial,  since  if  computational  power  were 
great  enough,  this  space  might  just  as  well  encompass  the  entire  regional  atmosphere 
for  the  current  time  as  well  as  some  considerable  antecedent  time  period.  However, 
in  this  case,  the  second  and  more  difficult  part  of  the  data  selection  problem 
(discussed  below)  would  be  greatly  aggravated. 

The  second  part  of  the  problem,  the  choice  of  potential  observations  which  will  affect 
the  analyzed  value,  requires  substantial  investigation.  Several  investigators  have 
found  that  the  use  of  four  or  five  parameters  is  generally  sufficient  to  obtain  the  best 
estimate  (lowest  root  mean  square  errors  [RMSE])  for  the  interpolated  value.  If  a 
smaller  or  larger  number  of  observations  is  allowed  to  influence  the  analysand,  the 
error  generally  increases.  Thus,  it  appears  that  the  number  of  potential  observations 
should  be  large,  but  the  number  actually  selected  for  the  interpolation  should  be 
small. 

Of course it is possible to avoid the problem of how to select a few "best"
observations  from  many  within  the  influence  space,  by  using  virtually  all  of  them. 
However,  as  we  have  already  indicated,  this  too  is  unlikely  to  be  the  best  procedure, 
in  terms  of  both  the  accuracy  and  stability  of  the  results  as  well  as  in  terms  of  the 
computational  effort  involved.  Stepwise  regression  has  been  used  for  many  years  as 
a  possible  solution  to  the  selection  process  in  multivariate  regression. 
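
The multicollinearity effect described in the excerpt can be illustrated with the two-observation case tabulated earlier. The sketch below assumes normalized variables (unit variance) and solves the 2x2 normal equations in closed form; the function name and argument names are illustrative only and do not come from the report.

```python
import math

def two_obs_weights(r_g1, r_g2, r_12):
    """Least-squares weights for two normalized observations.

    r_g1, r_g2: correlation of the gridpoint value with observations 1 and 2.
    r_12: intercorrelation between the two observations (must differ from +/-1).
    Returns (W1, W2, multiple correlation).
    """
    det = 1.0 - r_12 * r_12          # determinant of the 2x2 correlation matrix
    w1 = (r_g1 - r_12 * r_g2) / det
    w2 = (r_g2 - r_12 * r_g1) / det
    # Explained variance is the weighted sum of the gridpoint correlations.
    multiple_r = math.sqrt(w1 * r_g1 + w2 * r_g2)
    return w1, w2, multiple_r

if __name__ == "__main__":
    # As r_12 approaches 1 the determinant shrinks and the weights grow large
    # with opposite signs, which is the ill-conditioning noted in the text.
    for r_12 in (0.0, 0.3, 0.6, 0.9, 0.97):
        w1, w2, r = two_obs_weights(0.5, 0.3, r_12)
        print(f"r_12={r_12:4.2f}  W1={w1:8.4f}  W2={w2:8.4f}  R={r:5.3f}")
```

With the gridpoint correlations fixed at 0.5 and 0.3, the weights stay moderate for small intercorrelations but diverge as the determinant 1 - r_12^2 approaches zero.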


While stepwise regression does not guarantee the best selection of "predictors," a large body
of experience has demonstrated its effectiveness as a practical regression technique that can
approach the "best" solution when properly applied. Stepwise regression avoids the arbitrary
limitation and selection procedure of the National Meteorological Center (see Hoke et al., 1989) and,
at the same time, has the advantage of a large pool of potential "predictors" but is required to invert
only relatively small matrices. Since the matrix inversion in an OI scheme that uses stepwise
regression would typically involve three to six variables, it would appear that stepwise regression
might be less costly, computationally, than operational OI systems, as will be shown.

First, two typical examples are discussed where 26 randomly distributed artificial
"observations" are available. The correlations between the preliminary field departures required to
determine the weights assigned to the observations [$r(\Delta_i, \Delta_j)$, $r(\Delta_g, \Delta_j)$] were specified by a very
simple exponential function of the separation distances between observations (i,j) and between the
analysis gridpoint and the observations (g,j).

In Case 1, the correlation $r(\Delta_g, \Delta_j)$ varied from a low of 0.11 to a high of 0.54. The
intercorrelations among the observation points $r(\Delta_i, \Delta_j)$ varied from 0.02 to 0.86. In this study, as
well as in the others, it was assumed that all variables were normalized with a variance of unity.
Table 2 shows the weights assigned to the preliminary (first guess) field departures from the
observations when all 26 observations are used in the analysis as well as the weights when only the
best five observations are used. "Best" here is to be interpreted in terms of linear, least squares
stepwise regression.

It is apparent that there is little difference between the best five $W_i$ whether all 26 or only
these 5 observations enter the regression. Furthermore, in terms of multiple correlation ($r_{g.12 \ldots k}$),
where k goes either to 5 (for the best five observations) or to 26 (for all observations), the results
are also virtually identical; with k=5, the multiple correlation (0.6633) is numerically slightly greater
than with k=26 (0.6611)*. In spite of the fact that the weight of the sixth observation of the 26
appears significant (0.0474), the remaining $W_i$ (i>7), which are small and largely negative, appear
merely to be introducing noise. In this case at least, it would be preferable to stop the selection after
5 or 6 observations, rather than continuing to 26.

* This is possible here only because of round-off. All of the bivariate correlations were rounded
to two decimals.

In Case 2, while the details differ, the results are similar to those of Case 1. The correlations
$r(\Delta_g, \Delta_j)$ varied from 0.09 to 0.50, while $r(\Delta_i, \Delta_j)$ varied from 0.01 to 0.95. Table 3 also shows the
similarity in the first five selected observations, as well as the small magnitude of the weights beyond
Table 2. Weights ($W_i$) given to the observations in Case 1, when all 26 observations are used,
compared with the weights when only the best five observations are used. The multiple
correlation $r_{g.12 \ldots k}$ is also shown.

  Observation            Weights (W_i)
  Number i        All Observations    Best Five
                  Used                Used

   1                 0.3278            0.3320
   2                 0.2244            0.2267
   3                 0.1656            0.1737
   4                 0.1488            0.1475
   5                 0.1150            0.1001
   6                 0.0474
   7                 0.0139
   8                 0.0055
   9                 0.0038
  10                 0.0036
  11                 0.0036
  12                -0.0012
  13                -0.0015
  14                -0.0016
  15                -0.0027
  16                -0.0038
  17                -0.0062
  18                -0.0065
  19                -0.0077
  20                -0.0079
  21                -0.0082
  22                -0.0088
  23                -0.0092
  24                -0.0093
  25                -0.0160
  26                -0.0258

  r_g.12...k         0.6611            0.6633

the sixth observation. Again, in terms of multiple correlation, virtually all of the information is
contained in the best five observations. Specifically, $r_{g.12 \ldots k}$ = 0.6545 with k=5 and 0.6551
with k=26.

These  two  cases  imply  that  if  there  were  a  simple,  rational  procedure  for  selecting  (in  these  cases) 
the  best  five  observations,  it  would  not  be  necessary  to  invert  a  matrix  of  26x26,  but  only  a  5x5. 
Clearly,  the  resulting  analysis  would  be  much  more  computationally  efficient.  Stepwise  regression 
offers  such  a  rational  procedure. 

Another  experiment  illustrates  the  application  of  real,  densely  distributed  surface  data.  In 
these  cases,  actual  analyses  are  made  of  the  sea  level  pressure,  using  an  OI  procedure  based  upon 
multiple  regression  with  all  available  observations,  and  then  a  comparison  analysis,  using  only  a  small 
sample  of  these  observations.  In  both  cases  the  analyses  based  on  all  observations  are  very  close  to 
the  observed  value  at  the  "gridpoint"  (strictly  speaking,  a  pseudo  gridpoint). 

3.3  FORWARD  STEPWISE  REGRESSION 

To  determine  if  a  stepwise  regression  scheme  might  be  useful,  a  reanalysis  (that  is,  a 
succeeding  analysis  that  makes  use  of  the  weights  calculated  in  the  original  analysis)  was  performed 
using  the  OI  scheme  but  with  carefully  selected  observations.  In  OI  the  observations  that  receive  very 
small  or  even  negative  weights  apparently  contribute  little  information  to  an  analysis.  Therefore,  any 
observations  that  had  been  weighted  with  values  less  than  0.01  were  dropped  from  the  reanalysis. 
The  resulting  reanalysis  on  average  used  only  six  observations  (compared  to  20  observations  used 
in  the  original  analysis)  with  three  of  these  often  providing  most  of  the  information  (that  is,  these 
three  observations  had  more  than  95  percent  of  the  total  weight).  The  reanalysis  used  more  than 
nine  observations  at  only  0.5  percent  of  the  pseudo  gridpoints.  In  those  cases  where  more  than  six 
observations  were  retained,  the  weights  of  the  additional  observations  were  on  the  order  of  0.01.  So, 
given  a  physically  reasonable  structure  function,  OI  can  produce  an  excellent  analysis  with  only  a  few 
carefully  selected  observations. 

In  each  of  the  experiments  the  multiple  correlation  between  the  gridpoint  and  the  retained 
observations  did  not  change  significantly  (more  than  0.003),  whether  20  observations  were  part  of 
the  OI  scheme  or  only  the  selected  observations  were  included.  In  those  cases  where  the  multiple 
Table 3. Weights ($W_i$) in Case 2, when all 26 observations are used, compared with the weights
when only the best five observations are used. The multiple correlation $r_{g.12 \ldots k}$ is also
shown.

  Observation            Weights (W_i)
  Number i        All Observations    Best Five
                  Used                Used

   1                 0.2871            0.3140
   2                 0.2449            0.2412
   3                 0.2084            0.1993
   4                 0.1229            0.1087
   5                 0.1135            0.1286
   6                 0.0445
   7                 0.0202
   8                 0.0020
   9                 0.0008
  10                 0.0004
  11                -0.0006
  12                -0.0014
  13                -0.0018
  14                -0.0021
  15                -0.0025
  16                -0.0032
  17                -0.0050
  18                -0.0054
  19                -0.0055
  20                -0.0082
  21                -0.0083
  22                -0.0096
  23                -0.0101
  24                -0.0118
  25                -0.0124
  26                -0.0133

  r_g.12...k         0.6551            0.6545


correlation was slightly lower, because fewer observations entered the OI analysis, the resulting
reanalysis at a gridpoint was better more often than not.
Specifically, in one experiment with 214 real observations, each of which served as a pseudo
gridpoint, for 41 points the multiple correlation decreased more than 0.001 but less than 0.003 when
the OI scheme used only the selected (six on average) observations. In those 41 events, the analyzed
value was more accurate 26 times and less accurate 15 times. The root-mean-square (rms)
difference between the observed values at all 214 points and the analyzed value at those points was
1.0062, which is slightly greater than the rms difference of 0.9648 resulting from the reanalysis. In
addition, the reanalysis reduced the bias of the average differences by a factor of 3.

It  is  apparent  not  only  from  the  examples  illustrated  here  but  also  from  a  large  body  of  both 
theory  and  experience  that  the  information  contained  in  a  large  number  of  inter-related  "predictor" 
variables  can  be  closely  approximated  by  a  relatively  small  number  of  these  variables  or,  what 
amounts  to  essentially  the  same  thing,  a  small  number  of  transformed  or  factorized  variables. 
Stepwise regression is a simple procedure that produces results close to those of orthogonal
transformation, but with far less computational effort. Because of the favorable results and the ease
and simplicity of computation, stepwise regression is ideally suited for selecting the relevant and
significant observations from the larger body of available observations, and at the same time
evaluating the weights. (Keegan and Shapiro [1985] show that the selection scheme rigorously
accounts for the intercorrelations among observations, and that it is a conceptual error to select
observational data solely on the basis of the correlations between the observation and the analysand.)

While  a  variety  of  forms  of  stepwise  regression  are  in  use,  a  simple  forward  searching 
procedure  seems  appropriate  for  this  application.  Although  existing  computer  programs  may  differ 
in  detail,  a  simple  forward  stepwise  regression  (FSR)  proceeds  essentially  as  outlined  below. 

Let N represent the number of potential observations available to specify an analyzed value
of the parameter at gridpoint g. A specified function has been assumed that determines (for
example, as a function of the distance of separation) the intercorrelations among the N variables as
well as between the gridpoint parameter and each of the N variables.

In  the  first  step,  the  observation  having  the  largest  correlation  with  the  gridpoint  parameter 
(G)  is  selected.  In  general  this  will  be  the  closest  observation  and  is  designated  observation  A. 
In the second step, (N-1) multiple correlation coefficients are obtained with G as the
dependent variable and with A and, in turn, each of the remaining (N-1) observations as
independent variables. Each of these multiple correlations involves A and a different observation
as the independent variables. The pair of observations yielding the largest multiple correlation is
then selected. The second variable of this pair is designated as B.

In the third step, (N-2) multiple correlations are obtained between G and three independent
variables: A, B, and each of the remaining (N-2) observations in turn. The triplet of observations
yielding the largest multiple correlation is selected.

The FSR process continues in this manner until some threshold is reached. This threshold
is generally specified in terms of explained variance (the square of the multiple correlation
coefficient). Say that k observations have been selected by the process outlined above. The
selection process stops when the square of the multiple correlation with k independent variables does
not exceed that with (k-1) variables by a specified amount $\varepsilon$, where $\varepsilon$ is typically 0.01 or less. That
is, (k-1) observations are used when $r^2(k) - r^2(k-1) < \varepsilon$.
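
The FSR steps outlined above can be sketched in a few lines. This is a minimal illustration, not the RAP implementation: it assumes normalized variables, so that the explained variance for a subset is the dot product of the regression weights with the gridpoint correlations. The names `corr` (observation intercorrelation matrix), `g` (gridpoint-to-observation correlations), and `eps` (the threshold) are chosen here for illustration.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def explained_variance(corr, g, subset):
    """Return (r squared, weights) for regression on the given observations."""
    sub = [[corr[i][j] for j in subset] for i in subset]
    rhs = [g[i] for i in subset]
    w = solve(sub, rhs)
    return sum(wi * gi for wi, gi in zip(w, rhs)), w

def forward_stepwise(corr, g, eps=0.01):
    """Select observations one at a time until the gain in r squared < eps."""
    remaining = list(range(len(g)))
    first = max(remaining, key=lambda i: g[i])   # step 1: largest correlation
    selected = [first]
    remaining.remove(first)
    r2, weights = explained_variance(corr, g, selected)
    while remaining:
        # Try each remaining observation together with those already chosen.
        trials = [(explained_variance(corr, g, selected + [c]), c)
                  for c in remaining]
        (best_r2, best_w), best = max(trials, key=lambda t: t[0][0])
        if best_r2 - r2 < eps:                   # stopping rule from the text
            break
        selected.append(best)
        remaining.remove(best)
        r2, weights = best_r2, best_w
    return selected, weights, r2

if __name__ == "__main__":
    # Observation 2 nearly duplicates observation 0, so it adds little.
    corr = [[1.0, 0.0, 0.95], [0.0, 1.0, 0.0], [0.95, 0.0, 1.0]]
    g = [0.5, 0.3, 0.48]
    print(forward_stepwise(corr, g))
```

In the small demonstration, the third observation has a high correlation with the gridpoint but is nearly redundant with the first, so the procedure stops after two observations and only 2x2 systems enter the final weights.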

3.4  ERROR  CORRELATION  STRUCTURE  FUNCTIONS 

From the preceding subsections, it is clear that the analysis technique relies heavily on the
capability  to  develop  physically  realistic  error  correlation  functions.  These  functions  determine  those 
observations  that  will  influence  an  analysis  at  a  point  and  how  strong  that  influence  will  be. 

The horizontal correlation coefficients will be fitted to the following structure functions
(Mitchell et al., 1990), where sea level pressure = P, temperature = T, dewpoint = Td, humidity
= Q, u-component and v-component of wind = U and V, respectively, and height of mandatory
level = Z. For horizontal correlations of meteorological elements Z, T, Q (or Td), and P (or Z),
the raw correlations $r(\Delta_i, \Delta_j)$ will be fitted to the function

$$HCOR_{ij} = R_k (1+\sigma)^{-1} \left\{ \left[1 + c_k d_{ij} + \frac{c_k^2 d_{ij}^2}{3}\right] \exp(-c_k d_{ij}) + \sigma \left[1 + \frac{c_k d_{ij}}{N} + \frac{c_k^2 d_{ij}^2}{3N^2}\right] \exp\left(-\frac{c_k d_{ij}}{N}\right) \right\}$$

where $\sigma$ = 0.2, $d_{ij}$ is the distance between two observation points i and j, and N = 3. The constants
$R_k$ and $c_k$, both of which are different for each element, will be calculated from RWFM forecast
error statistics, as discussed in Section 4.5, and fitted with the IMSL (PL VAX Library) routine
RNLIN. For winds the data will be fitted (using the IMSL routine RNLIN) to

$$HCOR_{ij} = R_v \left(1+\frac{\sigma}{N^2}\right)^{-1} \left\{ \left[1 + c_v d_{ij}\right] \exp(-c_v d_{ij}) + \frac{\sigma}{N^2} \left[1 + \frac{c_v d_{ij}}{N}\right] \exp\left(-\frac{c_v d_{ij}}{N}\right) \right\}$$

where $\sigma$ = 0.2 and N = 3, and the constants $R_v$ and $c_v$ will also be calculated from RWFM forecast
error statistics, as discussed in Subsection 4.5.1.

For the vertical dimension and time, the U.S. Air Force Global Weather Central Tech Note
(1986) uses the following notations. The vertical correlation is represented by

$$VCOR = \left\{ 1 + C_1 \left[ \ln \frac{P(1)}{P(2)} \right]^2 \right\}^{-1}$$

where $C_1$ is a positive constant determined from data discussed in Subsection 4.5.2, P(1) is the
pressure at point #1, and P(2) is the pressure at point #2. If this simple function is sufficient to fit
the data, it will be used; otherwise, a more sophisticated approach will be followed. The time
correlation function will probably be of the form

$$TMCOR = e^{-C \Delta T}$$

where C is a positive constant, determined from RWFM forecast error statistics discussed in
Subsection 4.5.3, and $\Delta T$ is the absolute time difference either between two observations or between
observations and analysis time. The total correlation TCOR is assumed to be the product of the
space and time correlations, that is,

$$TCOR = VCOR \times HCOR_{ij} \times TMCOR$$

The total correlation (TCOR) then replaces $r_{ij}$ in the horizontal system of linear equations used to
calculate the weights $W_i$.

These analytic functions will model the intercorrelations and cross-correlations of errors at
observation points and the correlation and cross-correlations between an observed variable and the
first guess forecast at a gridpoint. Through stepwise regression, the correlations will determine the
observations that will be interpolated to a gridpoint in the OI scheme.
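
As a rough illustration of how the structure functions of this subsection combine, the sketch below evaluates the horizontal, vertical, and time correlations and their product. The functional forms follow a reading of the equations above, which are partly garbled in the source, so they should be treated as assumptions. The numerical constants in the demonstration (amplitude, length scale, vertical and time constants) are placeholders invented for this example; the report states that the actual values are to be fitted from RWFM forecast error statistics.

```python
import math

SIGMA = 0.2   # value stated in the text
N = 3.0       # value stated in the text

def hcor_mass(d, r_k, c_k):
    """Horizontal correlation for mass-type elements at separation d."""
    x = c_k * d
    short = (1.0 + x + x * x / 3.0) * math.exp(-x)
    broad = (1.0 + x / N + x * x / (3.0 * N * N)) * math.exp(-x / N)
    return r_k * (short + SIGMA * broad) / (1.0 + SIGMA)

def hcor_wind(d, r_v, c_v):
    """Horizontal correlation for wind components at separation d."""
    x = c_v * d
    short = (1.0 + x) * math.exp(-x)
    broad = (1.0 + x / N) * math.exp(-x / N)
    s = SIGMA / (N * N)
    return r_v * (short + s * broad) / (1.0 + s)

def vcor(p1, p2, c1):
    """Vertical correlation between pressure levels p1 and p2 (same units)."""
    return 1.0 / (1.0 + c1 * math.log(p1 / p2) ** 2)

def tmcor(dt, c):
    """Time correlation for an absolute time difference dt."""
    return math.exp(-c * abs(dt))

def tcor(d, p1, p2, dt, r_k, c_k, c1, c):
    """Total correlation as the product of the three factors."""
    return vcor(p1, p2, c1) * hcor_mass(d, r_k, c_k) * tmcor(dt, c)

if __name__ == "__main__":
    # Placeholder constants, NOT fitted values: r_k=0.9, c_k=0.002 per km,
    # c1=5.0, c=0.05 per hour.
    for d in (0.0, 250.0, 500.0, 1000.0):
        print(d, round(tcor(d, 500.0, 700.0, 3.0, 0.9, 0.002, 5.0, 0.05), 4))
```

Both horizontal forms are normalized so that the value at zero separation equals the amplitude, and the vertical form as written is symmetric in the two pressure levels.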

3.5  BUDDY  CHECK 

RAP  project  members  quickly  recognized  the  importance  of  developing  an  error  checking 
procedure  as  an  integral  part  of  the  observational  data  selection.  To  create  such  a  procedure,  they 
developed  a  gross  error  check  scheme  for  eliminating  obviously  erroneous  data  and  a  buddy  check 
for  a  more  refined  analysis.  The  latter  scheme  compares  each  observation  with  the  value  obtained 
by interpolating to the observation point without using the datum itself. Both schemes are univariate
and two-dimensional, operating on mandatory levels one at a time.

For every point i, the buddy check scheme first selects nearby observation points and
computes an analyzed value at that point by using the stepwise regression and optimum interpolation
as described in Section 3.2. Then the scheme calculates the analysis error, defined by the expression
$d_i = A_i - O_i$, where $d_i$, $A_i$, and $O_i$ are the analysis error, the analysis, and the observed value,
respectively, at point i.

After the calculation of $d_i$, $A_i$, and $O_i$ for all observation points, the scheme determines the
average analysis error, $\bar{d}$, and the standard deviation, $\sigma_d$. Any observed value with an absolute
analysis error, defined as $e_i = |d_i|$, exceeding 2.5 standard deviations becomes a candidate for
rejection, and the scheme flags this value.

The scheme scrutinizes each flagged value $O_i$ by determining its effect on every analysis value
$A_j$ affected by $O_i$ (i.e., those points where the analysis uses $O_i$ as one of the observations to
interpolate). To recheck each $A_j$, the scheme first reanalyzes every $A_j$ by excluding the flagged
value $O_i$. Next the scheme calculates $\bar{d}$ as the mean analysis error for the subfield that excludes
all flagged points and the weighted difference $D_i$, defined by the expression

[expression for $D_i$ not legible in the source]

where $W_i = e_i / (2.5\sigma_d + \bar{d})$, $n_i$ is the number of points used to calculate $A_i$, $A_j'$ is $A_j$ recalculated
by excluding $O_i$, and $O_j$ is the observed value at point j. The factor $W_i$ exceeds unity for all flagged
values and magnifies the effect of the absolute analysis errors.

If $D_i$ is positive, then removal of $O_i$ reduces the average analysis error at all points j, and $O_i$
sufficiently degrades the analyses at those points to warrant rejection. Otherwise, if $D_i$ is negative,
excluding $O_i$ will make the analyses at the neighboring points worse and, therefore, the scheme
removes the flag and preserves the observed value.
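
The first stage of the buddy check, flagging candidates for rejection, can be sketched as follows. This covers only the 2.5 standard deviation screening described above; the second stage (the weighted difference test) is not reproduced because its defining expression is not legible in the source. The use of the sample standard deviation and the function name are assumptions made for this illustration.

```python
import statistics

def flag_candidates(analysis, observed, n_sigma=2.5):
    """Flag observations whose absolute analysis error exceeds n_sigma
    standard deviations of the analysis errors over all points.

    analysis[i] is the value interpolated to point i without using the
    datum at i; observed[i] is the datum itself.
    """
    d = [a - o for a, o in zip(analysis, observed)]   # d_i = A_i - O_i
    d_bar = statistics.mean(d)                        # average analysis error
    sigma = statistics.stdev(d)                       # assumed: sample std dev
    flagged = [i for i, di in enumerate(d) if abs(di) > n_sigma * sigma]
    return d_bar, sigma, flagged

if __name__ == "__main__":
    # Twenty sea level pressure reports (mb); the report at index 7 is 8 mb low.
    observed = [1000.0] * 20
    observed[7] = 992.0
    analysis = [1000.5 if i % 2 == 0 else 999.5 for i in range(20)]
    analysis[7] = 1000.0
    print(flag_candidates(analysis, observed))
```

A flagged value is only a candidate at this point; per the text, the flag is removed again if excluding the observation would make the neighboring analyses worse.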


4.  DATABASE  SPECIFICATION 

Much  of  the  effort  expended  on  RAP  has  been  on  the  development  of  extensive  databases. 
These  include  databases  of  analyses,  forecasts,  observations,  forecast  error  correlations,  and  data 
required  for  observing  system  simulation  experiments.  Each  of  these  databases  is  described. 

4.1  ANALYSIS  DATABASE 

All HIRAS data needed to initialize the GSM or to verify the RWFM or RAP were saved on
magnetic tapes. There are 62 files per month for July 1988 and January 1989. Each file has 15
mandatory levels and 10,585 (145x73) gridpoints per level. The 15 levels are at 1,000 mb, 850 mb,
700 mb, 500 mb, 400 mb, 300 mb, 250 mb, 200 mb, 150 mb, 100 mb, 70 mb, 50 mb, 30 mb, 20 mb,
and 10 mb. No surface data except sea level pressure are included because RAP can be verified
better there with actual observations or other techniques. The lower six levels contain U, V, T, Q
(both specific and relative humidity), and D-values (that is, the difference between the measured
height and standard height at the given level); levels 7 through 15 contain similar data except no
humidity is available. The HIRAS data needed for verification were interpolated onto the RWFM
grid in the Eurasian (EU) and Central American (CA) windows.

4.2  FORECAST  DATABASE 

The  GSM  was  initialized  from  HIRAS  on  both  1  January  1989  at  00  UTC  and  1  July  1988 
at  00  UTC  and  at  2.5  day  intervals  thereafter  during  those  months,  yielding  12  independent  36  hr 
forecast  cycles  for  January  and  July.  The  spectral  coefficients  of  the  nonlinear  mode  initialization 
(NMI) and the 6-hr forecasts of surface pressure, wind components, temperature, and humidity were
postprocessed to a 2.5° grid on mandatory levels, and saved. Then two 36 hr GSM and RWFM
forecasts were executed (the RWFM boundary conditions came from the GSM forecasts), one in the
Eurasian (EU) and the other in the Central American (CA) regions. Both the GSM and
RWFM forecasts were subjectively and objectively compared to each other and to HIRAS (see
Appendix B) to ensure that only acceptable forecasts were in the database.
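
The initialization schedule described above (00 UTC on the first of the month, then every 2.5 days, 12 cycles per month) can be enumerated with a short sketch. The function name is illustrative; the start dates, interval, and count are taken from the text.

```python
from datetime import datetime, timedelta

def cycle_starts(first, n_cycles=12, step_days=2.5):
    """Initialization times: the first cycle plus one every step_days."""
    return [first + timedelta(days=step_days * k) for k in range(n_cycles)]

if __name__ == "__main__":
    for month_start in (datetime(1989, 1, 1, 0), datetime(1988, 7, 1, 0)):
        times = cycle_starts(month_start)
        print([t.strftime("%d/%H UTC") for t in times])
```

The 2.5-day spacing alternates the initialization hour between 00 UTC and 12 UTC, and the twelfth cycle falls on day 28 at 12 UTC, so its 36 hr forecast still verifies within the month.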

These  RWFM  forecasts  will  be  interpolated  to  the  RAP  window  on  the  uniform  gridded  data 
field  (UGDF)  grid,  which  will  be  used  as  the  RAP  grid.  To  be  consistent  with  AFGWC,  STC 
implemented the RWFM in the operational mode of a 16 σ-level, 61x61 grid with a horizontal
resolution  of  95  km.  The  UGDF  grid  system  has  a  horizontal  resolution  of  50  nm  at  mandatory 
levels  up  to  50  mb;  the  EU  window  is  on  a  polar  stereographic  projection,  and  the  CA  window  is 
on  a  Mercator  projection. 

The  RAP  window  has  40x40  UGDF  grid  boxes,  but  the  RWFM  has  60x60  grid  boxes  (each 
of  which  is  slightly  larger  than  a  UGDF  box).  Thus,  the  RAP  window  can  fit  inside  the  RWFM 
window  such  that  first  guess  forecasts  are  available  several  hundred  miles  beyond  the  RAP 
boundaries.  (These  forecasts  are  needed  to  calculate  errors  of  observations  outside  the  RAP  window 
that  affect  gridpoints  near  a  boundary.) 
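
The margin between the two windows can be checked with simple arithmetic. The sketch below assumes the RAP window is centered inside the RWFM window and uses the nominal grid spacings quoted above, ignoring map-projection scale factors; both are simplifying assumptions made for this example.

```python
KM_PER_NM = 1.852                # kilometers per nautical mile
KM_PER_STATUTE_MILE = 1.609344   # kilometers per statute mile

def window_margin_km(rap_boxes=40, rap_box_nm=50.0,
                     rwfm_boxes=60, rwfm_box_km=95.0):
    """Return (RAP width, RWFM width, margin on each side), all in km."""
    rap_width = rap_boxes * rap_box_nm * KM_PER_NM
    rwfm_width = rwfm_boxes * rwfm_box_km
    margin = (rwfm_width - rap_width) / 2.0
    return rap_width, rwfm_width, margin

if __name__ == "__main__":
    rap, rwfm, margin = window_margin_km()
    print(f"RAP window  : {rap:.0f} km across")
    print(f"RWFM window : {rwfm:.0f} km across")
    print(f"Margin      : {margin:.0f} km "
          f"({margin / KM_PER_STATUTE_MILE:.0f} statute miles) per side")
```

Under these assumptions a 50 nm UGDF box is about 92.6 km, slightly smaller than a 95 km RWFM box, and the first guess fields extend several hundred miles beyond each RAP boundary, consistent with the statement in the text.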

The following RWFM forecasts at 12, 18, 24, and 36 hr are stored as RAP first guess fields
for both regions and seasons: at the surface, sea level pressure and temperature; at all mandatory
levels, temperature, height, and u and v components of wind; and at the lower six mandatory levels,
relative and specific humidity.

4.3  OBSERVATION  DATABASE 

The  development  of  the  observation  database  required  decoding,  sorting,  and  merging  of  data 
extracted  from  34  magnetic  tapes  from  the  USAF  Environmental  Technical  Applications  Center. 
All  observations  were  ordered  by  station  number  with  data  in  daily  sequence  for  the  entire  month. 
The  database  required  synoptic  observations. 
Observations closest to (-3 to +1 hr) 00 UTC, 06 UTC, 12 UTC, and 18 UTC have been
stored  for  use  in  the  CA  and  EU  windows.  The  observation  window  is  approximately  the  size  of 
RWFM  (60x60  grid  boxes,  each  box  95x95  km).  The  observations  were  extracted  by  station  for 
sorting  and  merging  into  the  windows  by  season  and  time. 

The  data  are  surface  observations  of  sea  level  pressure  (P),  temperature  (T),  dewpoint  (Td), 
and station elevation; upper air (RAOB) observations of u and v components of wind (U and V,
respectively), height of mandatory level (Z), T, and Td; aircraft observations of U, V, T,
latitude/longitude,  assigned  pressure  level,  and  time;  and  unique  satellite  observations  of  T  and  Z 
in  the  form  of  a  RAOB. 

4.4  OBSERVATION  SYSTEM  SIMULATION  EXPERIMENT  DATABASES 

The  OSSE  databases  form  the  beginning  of  a  complex  data  processing  project.  The  simplest 
and  possibly  most  meaningful  description  of  the  databases  is  shown  in  Fig.  1. 

4.5  ERROR  CORRELATIONS  DATABASE 

From Section 3.4 there are three separate error correlations required: horizontal, vertical,
and temporal. The correlations are calculated from the differences at observation points between
forecast and observed values. These differences (the $\Delta_i$ from Eq. 2) are calculated on each of the
mandatory levels after horizontal interpolation of forecast (first guess) values from gridpoints to
observation points, using the four-point restorer scheme (Shapiro, 1978). Observations that match
the times of RWFM forecasts are interpolated vertically if necessary to mandatory levels (1,000 to
50 mb) for January 1989 and July 1988 in the CA and EU windows. Observations of Z, T, Td, U,
and V not on mandatory levels are interpolated linearly in ln P to the closer mandatory level.

Let $\Delta_i = f^p_i - f^o_i$ be the forecast error at an observation point i, where, from Eq. 1, $f^p_i$ is the
preliminary (first guess forecast) value of a meteorological element at observation point i, and $f^o_i$ is
the observed value at observation point i. The $\Delta_i$ can be calculated for each of the forecasts in the
database of Section 4.2, given the observed elements in Section 4.3. They are defined only when
forecast and observed elements are both available at nearby times and locations.
4.5.1 Horizontal Error Correlations

The $\Delta_i$ are a function of forecast length, mandatory level, month, instrument, distance
between and relative orientation of observations, and window. Table 4 illustrates a general error
database, which for horizontal correlations requires the $\Delta_i$ at specific distance intervals and direction
vector intervals between observation pairs. The mean ($M_i$) and the standard deviation ($\sigma_i$) of each
error variable are calculated and stored by summing the $\Delta_i$ over the 12 forecasts for each level,
month, window, and forecast length for all observation points (i), ignoring any point that has fewer
than eight observations.

$$M_i = \frac{1}{n} \sum \Delta_i \qquad \text{and} \qquad \sigma_i = \left\{ \frac{1}{n-1} \left[ \sum \Delta_i^2 - n M_i^2 \right] \right\}^{1/2}$$

where n is the number of forecasts with observations that match in time and $8 \le n \le 12$.
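
The mean and standard deviation defined above, together with the rule that a point needs at least eight matching forecasts, can be sketched as follows. Representing a missing forecast/observation match as `None` is a convention chosen for this example, as is the function name.

```python
import math

MIN_CASES = 8   # points with fewer matching forecasts are ignored

def error_stats(deltas):
    """Return (n, mean, standard deviation) of the forecast errors at one
    observation point, or None when fewer than MIN_CASES are available.

    deltas holds one entry per forecast cycle (up to 12); None marks a
    cycle with no matching observation.
    """
    vals = [d for d in deltas if d is not None]
    n = len(vals)
    if n < MIN_CASES:
        return None
    mean = sum(vals) / n
    # Sum-of-squares form used in the text, with the (n - 1) divisor.
    var = (sum(v * v for v in vals) - n * mean * mean) / (n - 1)
    return n, mean, math.sqrt(max(var, 0.0))

if __name__ == "__main__":
    errors = [1.2, -0.4, None, 0.9, 2.1, -1.5, 0.3, None, 0.8, -0.2, 1.1, 0.6]
    print(error_stats(errors))
```

The `max(var, 0.0)` guard only protects against a tiny negative value from round-off when all errors are nearly equal.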

The error at each observation point in the set is correlated with the error at the other
observation points for all variables. Let $r_{ij}$ be the (univariate) autocorrelation coefficient,

$$r_{ij} = \frac{\overline{\Delta_i \Delta_j} - \overline{\Delta_i}\,\overline{\Delta_j}}{\sigma_i \sigma_j} \qquad (4)$$

for observation point i and observation point j of the same meteorological error element, where the
overbar is an average over n. Note that

$$\overline{\Delta_i \Delta_j} = \overline{\Delta_j \Delta_i}$$

Similarly, for different meteorological elements, say a and b, a (bivariate) cross-correlation
can be written

$$r^{ab}_{ij} = \frac{\overline{\Delta^a_i \Delta^b_j} - \overline{\Delta^a_i}\,\overline{\Delta^b_j}}{\sigma^a_i \sigma^b_j} \qquad (5)$$

where Eq. 4 is the special case of a = b from the more general Eq. 5. (Both equations can be used
to calculate any type of correlation between two elements.) For both windows and months on all
levels at all forecast lengths, the univariate and bivariate horizontal correlation coefficients for each
observed element (identified if measured by aircraft or satellite) in the database will be calculated
Table 4. The generalized error database at observation point i for some specified level, month,
meteorological element, and window

  Initial Forecast     +12-hr               +18-hr    +24-hr    +36-hr
  (day / time)

   1 / 00 UTC          $\Delta_i(1,1)$        .         .       $\Delta_i(1,4)$
   3 / 12 UTC          $\Delta_i(2,1)$        .         .       $\Delta_i(2,4)$
   6 / 00 UTC            .                    .         .         .
   8 / 12 UTC            .                    .         .         .
      .
      .
  28 / 12 UTC          $\Delta_i(12,1)$       .         .       $\Delta_i(12,4)$


and  stored  for  each  unique  pair  of  observations  (i,j).  The  correlations  will  be  stored  by  instrument, 
month,  length  of  forecast,  and  mandatory  level  into  bins,  which  organize  paired  observations  by  the 
distance  between  them  (Thiebaux  et  al.,  1986)  in  50-km  intervals  out  to  2,500  km.  In  addition  to 
placing  the  correlation  pairs  into  bins  that  are  a  function  of  only  the  distance  between  the  pairs,  the 
correlations  are  also  "binned"  according  to  the  direction  of  the  line  connecting  the  paired  points. 
The  purpose  of  treating  the  line  as  a  vector  is  to  allow  a  determination  of  the  isotropy  of  error 
correlation.  Thus,  there  will  be  a  group  of  horizontal  correlation  coefficients  arranged  as  shown  in 
Table  5. 
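
The binning of observation pairs by distance and direction can be sketched as follows. This is an illustration under stated assumptions, not the report's code: the great-circle (haversine) distance, the Earth radius, the treatment of bin edges, and the definition of a quadrant by the signs of the latitude and longitude differences are all choices made here for the example.

```python
import math

BIN_WIDTH_KM = 50.0        # 50-km intervals (from the text)
MAX_DISTANCE_KM = 2500.0   # bins extend out to 2,500 km (from the text)
EARTH_RADIUS_KM = 6371.0   # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_bin(d_km):
    """Index of the 50-km bin (0 for the first bin), or None beyond 2,500 km."""
    if d_km > MAX_DISTANCE_KM:
        return None
    last_bin = int(MAX_DISTANCE_KM / BIN_WIDTH_KM) - 1
    return min(int(d_km // BIN_WIDTH_KM), last_bin)


def quadrant(lat1, lon1, lat2, lon2):
    """Quadrant (NE, NW, SW, SE) of the vector from point 1 to point 2."""
    north = lat2 >= lat1
    # Wrap the longitude difference into [-180, 180) before testing its sign.
    east = ((lon2 - lon1 + 540.0) % 360.0 - 180.0) >= 0.0
    return ("N" if north else "S") + ("E" if east else "W")
```

A pair of stations would then be filed under the key `(distance_bin(d), quadrant(...))`, in addition to the instrument, month, forecast length, and level.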

This  database  leads  to  a  complex  horizontal  error  correlation  model.  The  approach, 
however,  is  to  let  the  data  speak  for  themselves;  therefore,  no  univariate  or  multivariate 
combinations  of  meteorological  elements  can  be  eliminated  arbitrarily  from  consideration. 
Consequently,  horizontal  error  correlations  are  calculated  for  Z-Z,  T-T,  U-U,  V-V,  Q-Q,  Z-T,  Z-U, 
Z-V,  Z-Q,  T-U,  T-V,  T-Q,  U-V,  U-Q,  and  V-Q.  In  addition,  the  multivariate  correlations,  say 
r(A,B),  are  calculated  for  both  r(A,B)  and  r(B,A).  Also,  there  are  four  instruments  to  consider: 
aircraft, rawinsondes, and two satellites. Finally, there are four quadrants to account for the direction
of the vector connecting two observation points, four forecast lengths, two windows, and two
seasons.

Table 5. The horizontal correlation database for windows, seasons, and levels for each observed
unique pair of observations

                                   Distance (km)
                      0-50   51-100   101-150   ...   2451-2500

Instrument            ...
Europe                ...
Central America       ...
12-hr forecast        ...
18-hr forecast        ...
24-hr forecast        ...
36-hr forecast        ...
January               ...
July                  ...
Quadrant
  NE                  ...
  NW                  ...
  SW                  ...
  SE                  ...

Thus,  the  horizontal  error  correlation  model  is  a  function  of  several  thousand  variables! 
Undoubtedly,  the  model  will  be  simplified  because  many  of  these  variables  will  not  be  unique; 
nevertheless,  the  data  will  be  allowed  to  speak  for  themselves. 

4.5.2  Vertical  Error  Correlations 

Similarly, the univariate and bivariate vertical error correlation coefficients of all the
meteorological elements (Z-Z, T-T, U-U, V-V, Q-Q, Z-T, Z-U, Z-V, Z-Q, T-U, T-V, T-Q, U-V,
U-Q, and V-Q) will be calculated and averaged over all radiosonde sites in the database. Satellite data
are limited to error correlations of Z-Z, T-T, and Z-T. Of course, in this case there are no bins for
the distance between observation pairs or the direction vector between them. These correlations are
being computed rigorously for all mandatory levels, as shown in Table 6.


Consider the vertical error correlation of element A on level k with element B on level l at
a rawinsonde observation point, $r_{kl}(\Delta A_k, \Delta B_l)$ [for example, $r(\Delta A_{850}, \Delta B_{700})$, where A is on the 850-mb
level and B is on the 700-mb level]. The $\Delta_i$'s are calculated as shown in Table 4, and the correlations
are calculated from Eq. 5. The vertical correlations are functions of several variables: the
instrument; the months of January and July; the EU and CA windows; and the four forecasts (12 hr,
18 hr, 24 hr, and 36 hr). The averages are stored in Table 6.
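
As a small illustration of how many level pairs this involves, the sketch below enumerates every pairing of a level with a level above it. The list of levels is read from Table 6; the function name is hypothetical.

```python
from itertools import combinations

# Standard levels in mb, as listed in Table 6 (lowest level first).
MANDATORY_LEVELS_MB = [1000, 850, 700, 500, 400, 300, 250, 200, 150, 100, 70, 50]


def vertical_level_pairs(levels=MANDATORY_LEVELS_MB):
    """Return each (given level, level above) pair exactly once."""
    return list(combinations(levels, 2))
```

With 12 levels this gives 66 pairs for each element combination, window, month, and forecast length.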

4.5.3  Temporal  Error  Correlations 

Finally,  the  univariate  and  bivariate  temporal  error  correlation  coefficients  of  all  the 
meteorological  elements  will  be  calculated  and  their  averages  stored  for  Z-Z,  T-T,  U-U,  V-V,  Q-Q, 
Z-T,  Z-U,  Z-V,  Z-Q,  T-U,  T-V,  T-Q,  U-V,  U-Q,  and  V-Q.  In  addition,  the  multivariate  correlations, 
say r(A,B), are calculated for both r(A,B) and r(B,A). This is a comparatively simple (because there
are only four forecast times) calculation of correlating the $\Delta_i$'s at a given forecast time with the other
forecast times.

Consider, for example, meteorological elements A and B (recall that A = B for univariate
correlations). The temporal error correlation is given by the general expression $r_{mn}(\Delta A_m, \Delta B_n)$, where
m denotes the m-hr forecast and n denotes the n-hr forecast. Specifically, there are six unique temporal error
correlations: $r_T(\Delta A_{12h}, \Delta B_{18h})$, $r_T(\Delta A_{12h}, \Delta B_{24h})$, $r_T(\Delta A_{12h}, \Delta B_{36h})$, $r_T(\Delta A_{18h}, \Delta B_{24h})$, $r_T(\Delta A_{18h}, \Delta B_{36h})$, and
$r_T(\Delta A_{24h}, \Delta B_{36h})$. These correlations require the $\Delta_i$'s from Table 4 and are calculated from Eq. 5.
The temporal error correlations are functions of the same variables as are the vertical correlations.
The variables are the rawinsondes, the months of January and July, the EU and CA windows, and
the mandatory levels. The average correlations are stored as shown in Table 7.
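
The count of six temporal correlations follows from pairing the four forecast lengths without repetition. A minimal sketch (the names and the label format are illustrative only):

```python
from itertools import combinations

FORECAST_HOURS = [12, 18, 24, 36]  # forecast lengths named in the text


def temporal_pairs(hours=FORECAST_HOURS):
    """Return each unordered pair of forecast lengths exactly once."""
    return list(combinations(hours, 2))


def pair_labels(hours=FORECAST_HOURS):
    """Readable labels for the pairs, e.g. 'r_T(12h, 18h)'."""
    return ["r_T(%dh, %dh)" % (m, n) for m, n in temporal_pairs(hours)]
```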

5.  CASE  STUDIES  OF  NUMERICAL  ANALYSIS  EXPERIMENTS 

The  OI  experiments  with  real  data  have  yielded  many  interesting  and  relevant  results.  The 
numerical  experiments  discussed  in  Section  3  that  used  simulated  data  were  theoretically  interesting 
and  useful;  however,  real  data  offered  more  robust  experiments  and  practical  results.  Simulated 
error  correlation  coefficients  were  still  required,  however,  because  the  models  being  developed  under 
STC  Task  2  were  not  ready  for  use. 

Table 6. The average over all rawinsonde sites of the vertical error correlations of all observed
meteorological elements at the given level with the levels above for each window (Eurasia
and Central America), month (January and July), and forecast hour (12, 18, 24, and 36).

Standard Levels (mb)

Given level    Levels above

1000           850  700  500  400  300  250  200  150  100  70  50
 850           700  500  400  300  250  200  150  100   70  50
 700           500  400  300  250  200  150  100   70   50
 500           400  300  250  200  150  100   70   50
 400           300  250  200  150  100   70   50
 300           250  200  150  100   70   50
 250           200  150  100   70   50
 200           150  100   70   50
 150           100   70   50
 100            70   50
  70            50

Table 7. The average temporal forecast error correlation database for windows, months,
instruments, mandatory levels, and observed elements.

                              Forecast Hour
                      12 h    18 h    24 h    36 h

Europe
Central America
January
July
Mandatory Levels

The error correlation model followed Thiebaux et al. (1986), who found that the second-order
autoregressive function called SOAR worked well with meteorological data. Franke (1990)
used their general expression

$$C(s) = (1 + A)(1 + a s)\,e^{-a s} + A$$

where  s  denotes  distance,  and  a  and  A  are  parameters  determined  by  fitting  the  data. 

Our experiments used a very simple version of SOAR to compute the correlation coefficient
required in Eq. 3 between observation points 1 and 2, $r_{12} = e^{-a d}$, where d is the distance in
kilometers between the two points, and a is a scaling parameter arbitrarily determined such that
$r_{12} = 0.01$ at the extent of the chosen radius of influence. Thus, in this simple version of SOAR
no real data are fitted to determine a. (For example, when the radius of influence is 500 km,
a = 0.0092, and when it is 1000 km, a = 0.0046.) So from $r_{12} = e^{-a d}$, the simulated correlation
coefficients were calculated as a function of the distance between observations and the selected
point, and the intercorrelations were calculated as a function of the distance between the observations.
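
The scaling parameter follows directly from the requirement that the correlation fall to 0.01 at the radius of influence, so a = ln(100) divided by that radius. A short sketch reproduces the two values quoted above (the function names are illustrative):

```python
import math


def scale_parameter(radius_km, edge_correlation=0.01):
    """Scaling parameter a such that exp(-a * radius_km) equals edge_correlation."""
    return -math.log(edge_correlation) / radius_km


def simulated_correlation(distance_km, a):
    """Simple exponential correlation r = exp(-a * d) used in the experiments."""
    return math.exp(-a * distance_km)
```

For a 500-km radius this gives a of about 0.0092, and for 1000 km about 0.0046, in agreement with the text.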

The  results  of  Section  3.2  provided  STC  with  insight  into  data  selection  techniques  by  showing 
how  well  an  OI  analysis  at  observation  points  matched  verifying  observations  there.  The  results  in 
turn suggested several follow-on experiments. A description of the most noteworthy case studies,
their results, and conclusions follows.

5.1  ANALYSIS  AT  PSEUDO  GRIDPOINTS 

First,  a  database  of  the  sea  level  pressure  in  two  windows  was  prepared.  The  database 
identified  the  latitude  and  longitude  of  all  surface  observation  points  in  AFGWC  regions  in  the  EU 
and  CA  windows  at  a  time  for  which  a  24-hr  RWFM  forecast  was  available. 

Using the four-point restoring interpolation scheme of Shapiro (1978), the first guess field
(from the 24-hr RWFM forecast) at gridpoints was interpolated to each observation point. The
four-point restorer scheme interpolates from the 16 surrounding gridpoints on a plane to an
observation point. Its advantage is that, as a high-order linear interpolation scheme, it corrects
the phase distortion and amplitude damping of ordinary linear (two-point) interpolation. Also, the
restoring interpolation operator is computationally simpler and more accurate than cubic splines
for interpolation from a uniform grid.

The forecast error, $\Delta_i$ (defined here as observed pressure minus interpolated pressure), was
calculated for all observation points (i). Then the iterative analysis procedure for the window
selected all observation points one at a time, each of which served as a "pseudo" gridpoint for
purposes of this experiment. Each point and up to 20 of the closest observations within, say,
500 km (or some other given radius of influence) of this point made up a set of observations.
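
The construction of one such observation set can be sketched as below. The sketch assumes that station positions are already expressed as (x, y) coordinates in kilometers on a plane and that the selected point itself is not counted among its own neighbors; both are simplifications made for this example, and the function name is hypothetical.

```python
import math


def select_neighbors(points, target_index, radius_km=500.0, max_obs=20):
    """Return up to max_obs (distance_km, index) pairs, closest first.

    `points` is a list of (x_km, y_km) station positions; only stations
    within radius_km of the target station are considered.
    """
    tx, ty = points[target_index]
    candidates = []
    for idx, (x, y) in enumerate(points):
        if idx == target_index:
            continue  # assumed: the pseudo gridpoint is not its own neighbor
        d = math.hypot(x - tx, y - ty)
        if d <= radius_km:
            candidates.append((d, idx))
    candidates.sort()
    return candidates[:max_obs]
```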

Each  of  these  initial  numerical  analysis  experiments  in  optimum  interpolation,  taking  the  lead 
from  Sections  3.1  and  3.2,  had  two  parts,  the  second  following  from  the  first:  (1)  an  original  analysis 
at  an  observation  point  that  used  up  to  20  of  the  closest  observations  within  an  arbitrary  radius  of 
influence,  and  (2)  a  follow-on  analysis  at  each  point  that  used  a  selected  subset  of  the  observations 
from  the  original  analysis  at  the  point. 

The  original  analysis  at  a  selected  point  was  calculated  from  the  interpolated  first  guess  at 
that  point,  plus  the  sum  of  the  product  of  the  weights  and  the  errors  at  up  to  20  of  the  closest 
observation  points  (see  Eq.  1).  The  weights  were  calculated  by  inverting  the  correlation  matrix  from 
Eq.  3.  The  procedure  stopped  after  an  analysis  was  performed  at  all  observation  points  (which  are 
treated as pseudo gridpoints). This analysis is called OI-1. The analyzed values were compared with
the  observed  values  at  all  points.  The  rms  errors  and  average  errors  of  forecasts  and  analyses 
determined  the  goodness  of  the  OI  analysis  scheme. 
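
A minimal sketch of this analysis step is given below. Eqs. 1 and 3 are not reproduced in this section, so the sketch assumes the standard OI form: the weights solve a linear system whose matrix holds the correlations among the observations and whose right-hand side holds the correlations between each observation and the analysis point, with the simple model r = exp(-a d). The function names and the optional `obs_error` term (a normalized observation error variance added to the diagonal) are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def oi_analysis(first_guess, departures, dist_to_point, dist_between, a,
                obs_error=0.0):
    """Return (analysis value, weights) at one analysis point.

    departures[i]      observed minus first-guess value at observation i
    dist_to_point[i]   distance (km) from observation i to the analysis point
    dist_between[i][j] distance (km) between observations i and j
    """
    n = len(departures)
    corr = [[math.exp(-a * dist_between[i][j]) + (obs_error if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    corr_to_point = [math.exp(-a * d) for d in dist_to_point]
    weights = solve_linear(corr, corr_to_point)
    analysis = first_guess + sum(w * d for w, d in zip(weights, departures))
    return analysis, weights
```

With `obs_error` left at zero, an observation that coincides with the analysis point receives a weight of 1 and all others receive 0, so the analysis reproduces that observation exactly.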

The second part of this experiment is a follow-on analysis performed at each observation
(pseudo gridpoint). This second analysis used only observations from the original analysis at points
with weights $W_i > 0.01$ (see Tables 2 and 3) to influence the point to which observations were being
interpolated. In other words, for the second analysis, which is called OI-2, Eq. 1 was solved for a
restricted subset of the N observations. The effect was to use the set of observations that maximized
the information provided to the analysis from a limited number of observations.

This  experiment  was  clearly  not  designed  as  a  candidate  for  an  operational  numerical  analysis 
scheme.  The  purpose  was  to  show  that  a  greatly  reduced  subset  of  observations  available  to  perform 
an  analysis  at  a  point  is  sufficient  to  provide  an  analysis  nearly  equivalent  to  one  produced  by  using 
all available observations within some chosen radius of influence. Some of the most pertinent results
of these experiments follow. Given several good observations that are evenly distributed, the OI
scheme produced an excellent analysis (even with a poor first guess and a simple structure function).

In Europe, where most surface observations are close together, the two or three observations
with the highest correlations (between the gridpoint and observations as well as intercorrelations
between the observations) are heavily weighted, so OI in effect ignored observations with small
weights, as required by Eq. 1. The experiment first used the 20 closest observations within 500 km
of a selected point for the OI analysis, and invariably three or fewer observations carried nearly all the
weight. Even on the boundary or at remote points, where as few as eight observations were within
500 km, fewer than half of those observations had much effect on the analysis.

Other  experiments  also  yielded  results  that  confirmed  intuitive  expectations  but  needed 
documentation  nonetheless.  For  example,  an  obviously  erroneous  observation,  which  had  a  surface 
pressure  of  911  mb  in  a  pressure  field  that  averaged  1,010  mb,  got  into  the  database  (due  to 
incomplete  error  checking  at  that  time).  Whenever  that  observation  was  close  to  a  selected  point 
and  therefore  was  given  a  high  correlation,  the  analyzed  value  from  OI  was  bad,  no  matter  how  good 
the  first  guess  was.  This  example  pointed  out  the  critical  requirement  for  using  a  gross  error  check, 
and  it  suggested  the  need  for  using  a  buddy  check. 

Experiments  in  the  CA  window,  a  data-sparse  region  compared  to  Eurasia,  illustrated  how 
OI  performed  with  less  data.  The  first  guess  forecast  in  July  is  excellent;  consequently,  the  OI 
analysis  scheme  is  challenged  to  make  a  noticeable  improvement  from  the  preliminary  field.  The 
average  difference  between  the  first  guess  field  interpolated  to  the  211  observation  points  and  the 
observed  values  was  only  0.59  mb.  Experiments  with  three  different  radii  of  influence  were 
conducted:  the  standard  radius  of  influence  of  500  km,  one  shortened  to  250  km,  and  one  extending 
to  750  km.  The  purpose  was  to  check  how  additional  observations  would  improve  the  analysis. 

1.  250-km  radius  of  influence  from  gridpoint  to  the  observation 

Only  four  observations  on  average  entered  the  analysis  at  a  gridpoint  for  this  case. 
The  average  difference  between  the  analysis  and  the  observed  values  was  0.415  mb. 
The  second  analysis,  which  typically  used  60  percent  of  the  observations  in  the 
original  analysis,  had  an  average  difference  of  0.417  mb. 


2.  Radius  of  influence  extending  out  to  500  km 

At  this  radius  of  influence,  however,  a  mean  of  12  observations  were  included  in  the 
OI.  The  average  difference  between  the  original  analysis  and  the  observed  values 
was  0.242  mb.  The  second  analysis,  which  selected  only  one-third  of  the  observations 
from  the  original  analysis,  had  an  average  difference  of  0.236  mb.  The  rms 
differences  were  essentially  the  same  too. 

3.  Radius  of  influence  extending  out  to  750  km 

The OI scheme used 15 observations on average. This approach improved the
analysis  slightly  (but  not  significantly)  by  reducing  the  bias;  however,  the  rms 
differences  were  virtually  the  same  as  Case  2.  Given  an  error  correlation  model  that 
decreases  exponentially  with  increasing  distance,  it  was  predictable  that  extending  the 
radius  of  influence  beyond  some  optimum  distance  would  result  in  little  improvement 
of  an  analysis.  These  expectations  are  now  confirmed  in  practice. 

5.2  SELECTING  OBSERVATIONS  BY  FORWARD  STEPWISE  REGRESSION 

From  the  above  case  studies  and  Section  3,  it  can  be  concluded  that  a  second  analysis,  even 
though  using  a  subset  of  observations,  was  nearly  as  "good"  (approximately  the  same)  as  the  original 
analysis,  obtained  from  interpolating  the  full  set  of  N  observations  (Eq.  1).  Too  many  observations, 
most  with  weights  so  small  they  apparently  introduced  noise  rather  than  information,  did  little  to 
improve  an  analysis.  On  the  other  hand,  carefully  selected  observations  allowed  OI  to  perform  more 
efficiently  but  as  effectively.  The  selection  technique  of  Section  5.1,  however,  obviously  would  be 
too  cumbersome  in  practice  to  be  of  value  even  though  it  worked  in  theory.  Clearly,  it  makes  no 
sense  to  run  a  complete  analysis,  which  requires  the  inversion  of  the  large  matrix  in  Eq.  2, 
for  the  sole  purpose  of  identifying  a  subset  of  the  "best"  observations  to  influence  an 
analysis  point.  Sections  3.2  and  3.3  determined  a  better  method,  forward  stepwise 
regression.  Consider  the  following  case  studies. 

5.2.1 Sea Level Pressure Analyses

The first test of the FSR scheme was to select sea level pressure observations. Recall from
Section 3.2 that the efficiency of the FSR scheme is regulated by $\varepsilon$. With $\varepsilon$ = 0.001, 0.005, and 0.01
in three different experiments performed in both data-sparse and data-rich regions, the smaller
values of $\varepsilon$ served to force more observations into affecting the analysis. When $\varepsilon$ < 0.01 the
additional observations had very small weights, and the analysis hardly improved. On average,
including the additional observations resulted in an error reduction of less than 0.04 mb; the
maximum reduction was less than 0.1 mb.
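
The role of the threshold can be illustrated with a generic forward stepwise selection. The details of the FSR scheme are given in Section 3.2 and are not reproduced here, so the sketch below rests on an assumption: at each step the observation that most increases the explained (normalized) error variance at the analysis point is added, and selection stops when the best available gain falls below the threshold. All names are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def explained_variance(selected, corr, corr_to_point):
    """Normalized variance explained at the point by the selected observations."""
    if not selected:
        return 0.0
    sub = [[corr[i][j] for j in selected] for i in selected]
    rhs = [corr_to_point[i] for i in selected]
    weights = solve_linear(sub, rhs)
    return sum(w * r for w, r in zip(weights, rhs))


def forward_stepwise_select(corr, corr_to_point, epsilon=0.01):
    """Return indices of the selected observations, in order of selection.

    Assumes no two observations coincide (the sub-matrix must be invertible).
    """
    selected = []
    remaining = list(range(len(corr_to_point)))
    current = 0.0
    while remaining:
        best_gain, best_idx = max(
            (explained_variance(selected + [k], corr, corr_to_point) - current, k)
            for k in remaining)
        if best_gain < epsilon:
            break  # a smaller epsilon lets more observations into the analysis
        selected.append(best_idx)
        remaining.remove(best_idx)
        current += best_gain
    return selected
```

With the exponential correlation model, observations lying behind a closer observation on the same side of the analysis point add almost nothing once the closer one is selected, which is consistent with the small number of observations the scheme retains.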

Table  8  shows  a  typical  example  of  the  effectiveness  of  the  FSR  technique.  The  frequency 
distribution  of  the  number  of  observations  required  for  an  analysis  is  highly  skewed  towards  fewer 
observations.  The  western  Eurasian  region  contained  208  observations  used  as  pseudo  gridpoints, 
about  each  of  which  an  OI  analysis  was  performed  using  observations  selected  by  the  FSR.  The 
results from the FSR are compared to two prior analyses made at the same points: the original
analysis (OI-1) made from the complete set of those 20 observations and the second analysis (OI-2)
made from the subset of observations with weights $W_i > 0.01$. Each of the 208 observation points
had at least 20 observations within 500 km.

The  FSR  technique  is  obviously  computationally  superior  to  the  OI-2  technique  because  OI-2 
first  required  an  inversion  of  a  20x20  matrix  followed  by  a  second  inversion  of  a  matrix  of  a  size 
shown  in  Table  8,  on  average  a  6x6  matrix.  On  the  other  hand,  the  FSR  requires  the  inversion  of 
many  small  matrices,  whose  maximum  size  is  shown  in  the  table  but  typically  is  only  a  3x3  matrix. 

Similarly, selecting observations by the FSR for an OI analysis is better than using the
20 observations in the OI-1 analysis. On average the FSR needed to select only three observations
for an excellent OI analysis. Table 9 shows two representative measures of error in an analysis of
the surface pressure over western Eurasia, where the OI original analysis made use of the 20 closest
observations (OI-1), which were not rigorously selected. For this case the average first-guess error
(forecast minus observed pressure) is -7.49 mb, and the rms difference between the forecast and
observation is 8.23 mb.




Table 8. Frequency distribution of the occurrence of the given number of observations used in an
OI analysis at observation points in a data-rich region. The OI-2 is the second analysis
at a point that resulted when only observations with $W_i > 0.01$ were selected from a set
of the 20 closest observations, which influenced the original analysis OI-1 at the point.
The FSR is the analysis that resulted from optimum interpolation of observations selected
by the forward stepwise regression.

Table 9. Comparison of analysis errors in millibars. The OI-1 is the case where the observations
are not rigorously selected, that is, up to 20 observations within 500 km are used. The OI-2
and FSR are described in Table 8.

                                                            OI-1     OI-2     FSR

Average of Observed Minus Analyzed Pressure                -0.34    -0.12    -0.34
RMS Difference Between the Observation and the Analysis     1.00     0.96     1.00

Clearly,  the  OI  results  in  an  excellent  analysis,  with  an  average  error  reduction  of  94  percent 
and  an  rms  error  reduction  of  88  percent.  But  the  FSR  is  remarkable  because  it  needs  so  few 
observations  to  make  the  OI  effective.  The  point  is  that  these  observations  are  objectively  selected 
by  stepwise  regression.  Note  that  the  reduction  of  average  error  and  rms  error  is  only  slightly 
smaller  for  the  OI-2  case.  Also,  the  FSR  selects  on  average  three  observations  whose  OI  yields  an 
analysis  virtually  the  same  as  one  produced  from  the  full  set  of  20  observations. 

Similar results for surface pressure analyses occur over a data-sparse region, in this case a
region extending from northern South America to the southern United States. Table 10 shows that
the number of observations affecting an analysis (that is, the number of observations within 500 km
of each of 201 observation points) for this region has several peaks spread across the spectrum.


Only 25 percent of the points have a complete set of 20 observations within 500 km. (Earlier
experiments showed that extending the radius of influence from 500 km to 750 km had little effect
on the analysis.) Note that only slightly more than two observations on average are selected by the
FSR for an OI analysis compared to (a) an average of nearly 12 observations in the OI-1 group, and
(b) the slightly more than four observations (with $W_i \ge 0.01$) selected from the OI-1 group for use
in the OI-2 analysis.

But,  as  shown  in  Table  11,  the  results  are  similar,  even  in  a  data-sparse  region  when  the  first 
guess  itself  was  a  good  analysis.  For  this  case  the  average  first  guess  error  (forecast  minus  observed 
pressure)  is  only  0.59  mb,  and  the  rms  difference  between  the  forecast  and  observation 
is  only  2.42  mb. 

Clearly, the FSR is an outstanding technique for use in practice because it is both
computationally efficient and accurate. In this case, however, the first guess was very good, so OI
could reduce the average error by only 60 percent and the rms error by only 18 percent.
Nevertheless,  the  FSR  scheme  accomplished  the  reduction  by  using  only  a  few  observations,  rather 
than  attempting  to  squeeze  information,  which  does  not  exist  in  a  least  squares  sense,  from  a  host 
of  superfluous  observations. 


Table 10. Frequency distribution of the number of observations used in the optimum interpolation
analysis to interpolate a value to a point in a data-sparse region. The OI-1, OI-2, and
FSR have the same meaning as in Table 9.

Table 11. Comparison of analysis errors (in millibars) at observation points used in an OI analysis
in a data-sparse region. The OI-1, OI-2, and FSR are the same as in Tables 8 and 9.

                                                            OI-1     OI-2     FSR

Average of Observed Minus Analyzed Pressure                 0.24     0.24     0.28
RMS Difference Between the Observation and the Analysis     2.10     2.10     2.15

5.2.2 500-mb Height Analyses

With 500-mb heights replacing surface pressure, similar experiments would be generally
expected to yield similar results. Since the upper air data are sparse compared to surface data,
however, the radius of influence of the correlation function was extended to 1,000 km to allow more
potential observations to influence the analysis.

This case is a study of 180 observations over Eurasia on 9 January 1989 at 1200 UTC. The
FSR scheme was as effective at 500 mb as it was at the surface. On average the FSR used only
17 percent of the available observations surrounding each pseudo gridpoint; nevertheless, it produced
an analysis similar to one obtained by using all the observations within 1,000 km. Specifically, the
rms error of the FSR analysis was 76 percent of the error in the first guess; the rms error of the OI-1
analysis (all observations within 1,000 km are used) was 74 percent of the error in the first guess.

Table 12 shows the wide range of the number of observations in the OI-1 group and the
effect of the FSR on how many of those observations are needed for an analysis. It is impressive that
on average about three observations around a gridpoint can make an analysis nearly as good as
about 17 observations.

As shown in Table 13, the results are excellent, even though the average first guess error
(forecast - observed) is only -12.4 m and the rms difference is 28.7 m. The FSR improves the
analysis considerably. The average error is reduced by almost two-thirds and the rms error is
reduced by about one-quarter.

Table 12. Frequency distribution of the number of observations at 500 mb used in an OI analysis
of heights to interpolate a value to a point. The OI-1 and FSR are the same as in
Tables 8 and 9.

Table 13. Comparison of 500-mb height analysis errors in meters. The OI-1 and FSR are the same
as in Tables 8 and 9.

                                                            OI-1     FSR

Average of Observed Minus Analyzed Height                   -3.2     -4.3
RMS Difference Between the Observation and the Analysis     21.2     21.9
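
The reductions quoted for the 500-mb case can be checked from the first-guess errors in the text (-12.4 m average, 28.7 m rms) and the entries of Table 13. The sketch below is only an arithmetic check; because the first-guess error is defined as forecast minus observed while the table lists observed minus analyzed, it compares magnitudes, which is an assumption made here.

```python
FIRST_GUESS = {"average": -12.4, "rms": 28.7}        # meters, from the text
ANALYSES = {
    "OI-1": {"average": -3.2, "rms": 21.2},          # meters, from Table 13
    "FSR": {"average": -4.3, "rms": 21.9},
}


def percent_reduction(first_guess_error, analysis_error):
    """Percent by which the error magnitude is reduced by the analysis."""
    return 100.0 * (1.0 - abs(analysis_error) / abs(first_guess_error))


def reductions():
    """Percent reductions of the average and rms errors for each analysis."""
    return {name: {key: percent_reduction(FIRST_GUESS[key], value)
                   for key, value in errors.items()}
            for name, errors in ANALYSES.items()}
```

For the FSR this gives a reduction of about 65 percent in the average error and about 24 percent in the rms error, that is, an rms error of about 76 percent of the first-guess value.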

A close study of the detailed output of the experiments revealed some interesting results. At
slightly more than 20 percent of the points the FSR yielded a better analysis than OI-1, even though
the FSR used far fewer observations. At almost 6 percent of points the first guess was better than
the analyses, but the errors were very small. The maximum difference between the errors resulting
from FSR and OI-1 was less than 10 m. Taking all the results together, the FSR reduced the
average rms error by 30 percent, and OI-1 reduced the average rms error by 32 percent. Both
schemes reduced the average error by 40 percent.

In  sum,  optimum  interpolation  performed  an  excellent  analysis,  as  anticipated.  Forward 
stepwise  regression  made  the  general  OI  scheme  even  more  efficient. 

5.3 ANALYSES ON THE RAP GRID

Next, analyses were performed at actual gridpoints rather than the pseudo grid (that is,
observation) points, as in the preceding experiments. Each gridpoint and up to 20 of the closest
observations within 500 km (or some other given radius) of this point made up a set of observations.
(Recall that this analysis is called OI-1.) From this set the simulated correlation coefficients were
calculated (from the model $r_{ij} = e^{-a d}$) as a function of the distance between observations and the
gridpoint, and the intercorrelations between each of the observation points were calculated as a
function of the distance between the observations.

Several  numerical  experiments  provided  both  objective  and  subjective  results  that 
demonstrated  the  success  of  the  transformation  of  the  OI  scheme.  The  baseline  test  is  a  comparison 
of  two  analyses,  one  at  observation  points  and  the  other  at  gridpoints,  using  the  same  first  guess  and 
observation set. In all cases both analyses were similar. The second test compares an analysis
calculated from the OI using a set of up to 20 of the closest observations (OI-1) with an analysis
calculated  from  a  subset  of  those  observations  chosen  by  the  FSR  process.  This  test  confirmed  the 
technical  approach  of  integrating  stepwise  regression  into  optimal  interpolation.  All  meteorological 
variables  were  analyzed. 

The  objective  "measure  of  merit"  is  a  comparison  of  the  errors  (the  difference  between  the 
value  of  an  element  at  an  observation  point  and  the  value  obtained  by  interpolating  the  element  at 
surrounding  gridpoints  to  the  observation  point).  This  "grid-to-station"  verification  is  not  a  perfect 
measure  of  merit;  nevertheless,  it  is  useful. 

5.3.1  Sea  Level  Pressure  Analyses 

First, however, note from Table 14 that only relatively few observations of sea level pressure
are needed to produce an accurate analysis in a data-rich region. Of course, the FSR technique is
computationally superior to the OI-1 technique, which required an inversion of a large (on average
an 18x18 in this case) matrix. On the other hand, the FSR required the inversion of many small
matrices, usually a 3x3 matrix, as shown in Table 8. On the 41x41 RAP grid (1,681 gridpoints), the
reduced number of calculations needed for the FSR scheme is obviously substantial.
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Table  14.  Frequency  distribution  of  the  number  of  sea  level  pressure  observations  used  in  an  OI 
analysis  to  interpolate  a  value  to  a  point  in  a  data-rich  region  on  the  RAP  grid  system. 
The  OI-l  and  FSR  are  described  in  Tables  8  and  9. 


Number  of  observations 

1  2  3  4  5  6-10  11-15  16-20 

OI-l  0  0  0  0  0  31  329  1,321 

FSR  240  660  622  148  11  0  0  0 


The analyses were accurate, as measured by the reduction of average and root-mean-square
differences; however, plots of the sea level pressure appeared to be "noisy." Note from Eq. 1 that
an OI analysis results from adding the first guess field to the sum of the weighted departures
($\sum_j W_j D_j$). These corrections are slightly noisy when viewed as analyzed plots of the
departure fields at gridpoints. Also, lagged autocorrelations and Fourier analyses of the fields
suggest that there are some two-grid-interval variations (noise). These small-scale variations were
removed, without harming the overall analysis, by applying Shapiro’s seven-point linear smoothing
filter (Shapiro, 1975). It suppresses two-grid-interval waves without changing the phase of any
wave component and with little damping of the amplitudes of all other waves.
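
The behavior of such a filter can be checked with a short one-dimensional sketch. It assumes that the seven-point filter is the third-order member of Shapiro's family, whose response to a wave of wavenumber k on a grid of spacing Δx is 1 - sin^6(kΔx/2). The function name and the boundary treatment are illustrative choices, not taken from the RAP software.

```python
import math

# Stencil of the sixth-difference operator (binomial coefficients with
# alternating sign); dividing by 64 gives the assumed third-order filter.
WEIGHTS = [1, -6, 15, -20, 15, -6, 1]

def shapiro7(values):
    """Apply the seven-point filter along a 1-D row. The three points nearest
    each end are left unfiltered because the stencil does not fit there."""
    out = list(values)
    for i in range(3, len(values) - 3):
        d6 = sum(w * values[i + k - 3] for k, w in enumerate(WEIGHTS))
        out[i] = values[i] + d6 / 64.0
    return out

# A two-grid-interval wave is removed in the interior ...
noise = [(-1) ** i for i in range(12)]
filtered_noise = shapiro7(noise)

# ... while a wave of length 12 grid intervals is almost untouched.
long_wave = [math.sin(2 * math.pi * i / 12) for i in range(24)]
filtered_long = shapiro7(long_wave)
```

The stencil is symmetric, so the phase of every wave component is unchanged. For a two-dimensional field the same operation would be applied along rows and then along columns.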

5.3.2  500-mb  Height  Analyses 

Statistics  of  analyses  of  500-mb  height  surfaces  performed  at  pseudo  gridpoints  are  similar 
to  those  performed  on  a  grid.  On  average  the  OI  with  the  FSR  selecting  the  observations  required 
only  three  observations  to  make  an  analysis  nearly  identical  to  an  OI  analysis  that  used  an  average 
of  18  observations  surrounding  each  gridpoint.  Overlaid  plots  of  the  two  500-mb  analyses  revealed 
only  the  slightest  of  differences  between  the  two,  and  those  differences  were  isolated.  Table  15 
shows  two  measures  of  merit  in  an  analysis  of  the  500-mb  height  field  over  western  Eurasia.  For  this 
case  with  a  good,  unbiased  first  guess  the  average  error  (forecast  minus  observed)  is  only  10  m,  and 
the  rms  difference  between  the  forecast  and  observation  is  approximately  24  m.  The  average  rms 
difference  between  the  OI-l  and  FSR  (not  shown)  is  2.8  m,  or  12  percent  of  the  first  guess  error. 

For comparison Table 16 shows the results of analyses that use the same first guess and
observation set but are calculated at observation points instead of gridpoints. In this case the
average first guess error was nearly -16 m as compared to analysis errors of only about 2 m. The
first guess rms error (that is, the rms difference between the first guess and the observed 500-mb
height) was about 35 m compared to analyzed rms errors of less than 7 m.

Table 15. The 500-mb differences (FIRST GUESS minus analysis) in meters, using the OI on a 50-nm
grid over data-rich Eurasia. The OI-l and FSR are described in Tables 8 and 9.

                      OI-l     FSR
Average Value         9.97     9.25
Root-Mean-Square     24.2     23.0

Table 16. The 500-mb analysis errors (m) using the OI at the observation points over the Eurasian
region. FIRST GUESS is the RWFM (preliminary) forecast field interpolated to the
observation points.

                                 FIRST GUESS    OI-l     FSR
Average of Observed
Minus Analyzed                      -15.9       -1.18    -2.15
RMS Difference Between the
Observation and the Analysis         34.7        6.2      6.9

Clearly,  the  OI  results  in  an  excellent  analysis,  with  an  average  error  reduction  of  86  percent 
for  the  FSR  and  92  percent  for  the  OI-l,  and  an  rms  error  reduction  of  about  80  percent  for  both. 
Again, the FSR needs so few observations to make the OI effective because these observations are
objectively selected by stepwise regression. Note that the FSR average error and rms error are
only slightly larger than those of the OI-l case, even though the FSR selects on average only three
observations from the full set of up to 20.
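
The percentages quoted above can be reproduced from the entries of Table 16. The sketch below assumes that "error reduction" means the relative decrease in error magnitude from the first guess to the analysis; the function name is illustrative.

```python
def percent_reduction(first_guess_error, analysis_error):
    """Percentage by which the analysis error magnitude falls below that of the first guess."""
    return 100.0 * (abs(first_guess_error) - abs(analysis_error)) / abs(first_guess_error)

# Values copied from Table 16 (meters).
table16 = {
    "average": {"first_guess": -15.9, "OI-l": -1.18, "FSR": -2.15},
    "rms": {"first_guess": 34.7, "OI-l": 6.2, "FSR": 6.9},
}

reductions = {
    (measure, scheme): percent_reduction(row["first_guess"], row[scheme])
    for measure, row in table16.items()
    for scheme in ("OI-l", "FSR")
}
```

Under this definition the average error falls by roughly 92 to 93 percent for the OI-l and 86 percent for the FSR, and the rms error by roughly 82 and 80 percent, in line with the figures in the text.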

5.3.3  The  500-mb  Temperature  Analyses 

Similar  results  for  500-mb  temperature  analyses  occur  over  a  data-sparse  region,  in  this  case 
the  region  extending  from  northern  South  America  to  the  southern  United  States.  (As  shown, 
however,  the  distribution  of  observations  is  not  like  that  of  Table  13.)  In  fact,  a  composite  summary 
of several analyses of different variables in the two regions yields the same conclusions as the
individual case illustrated in Tables 9, 10, and 11. The rms errors of the analysis are typically only
20  percent  of  the  rms  errors  of  the  first  guess.  The  FSR  and  OI-l  analyses  are  remarkably  similar 
as  well  as  a  great  improvement  over  the  first  guess.  The  autocorrelation  of  both  analyses  is  typically 
0.99  for  a  data-rich  region  and  0.97  for  a  data-sparse  region.  Of  course,  the  amount  of  improvement 
depends  upon  the  quality  of  the  first  guess  forecast. 

5.3.4  The  500-mb  Humidity  Analyses 

The  excellent  performance  of  the  stepwise  regression  scheme  in  the  optimum  interpolation 
of  pressure,  both  on  the  surface  and  on  isobaric  levels,  and  temperature  leads  to  the  conclusion  that 
using  a  set  of  carefully  selected  observations  is  preferable  to  using  all  observations  in  some  arbitrary 
vicinity. The next test of the OI scheme was to analyze the more difficult meteorological variables
of humidity and wind.

In  spite  of  poor  first  guess  forecasts  of  moisture,  the  OI  scheme  produced  good  analyses  of 
humidity  (using  the  method  of  Redder  and  Fukuta  (1989)  to  convert  from  dewpoint  to  standard 
humidity  variables),  even  though  the  scheme  has  a  simple  univariate  correlation  function.  The  rms 
errors  and  average  errors  were  similar  to  those  of  other  variables. 

The  objective  test  continues  to  compare  an  analysis  calculated  from  the  OI  using  a  set  of  up 
to  20  of  the  closest  observations  (OI-l)  with  an  analysis  calculated  from  a  subset  of  those 
observations  chosen  by  the  FSR  process.  In  addition,  both  these  analyses  can  be  made  at  either 
gridpoints  or  observation  points,  which  serve  as  pseudo  gridpoints;  hence  it  is  straightforward  to 
compare  analyses  to  actual  observations  too.  This  test  confirmed  the  technical  approach  used  to 
integrate  stepwise  regression  into  optimal  interpolation,  and  with  a  simple  but  physically  realistic 
correlation  function.  The  objective  measure  of  merit  is  a  comparison  of  the  errors  (the  value  of  an 
element  at  an  observation  point  minus  the  value  obtained  by  interpolating  the  element  at 
surrounding  gridpoints  to  the  observation  point). 

Table 17 shows the average errors and rms errors of the 500-mb relative humidity analysis
and confirms the similarity between analyses using an average of 17 observations (OI-l) and the
analysis derived from optimum interpolation of observations selected by FSR. These two analyses
over western Eurasia are clearly similar, but they need to be complemented by a measure of merit
that compares errors at observation points.

Table 17. The difference between the FIRST GUESS (a 36 hr RWFM forecast) and the relative
humidity analysis on a grid over Eurasia. The OI-l is an analysis computed from a
complete set of observations, and the FSR is the analysis computed using forward
stepwise regression, which selects a subset of observations.

                          OI-l      FSR
Average Value (%)         0.125     0.052
Root-Mean-Square (%)     15.9      15.67

Analyzing  the  errors  at  the  observation  points  themselves,  rather  than  at  the  gridpoints  as 
in  Table  17,  gives  a  better  measure  of  accuracy  of  the  scheme.  Table  18  shows  that  the  scheme  is 
successful;  the  rms  errors  are  reduced  by  83  percent.  In  addition,  the  correlation  between  the 
analyzed  humidity  field  interpolated  to  the  observation  points  and  the  humidity  measured  at  the 
observation  points  is  0.97  (the  corresponding  first  guess  correlation  is  0.41). 

Thus,  even  with  a  poor  first  guess,  the  OI  scheme  produces  an  excellent  humidity  analysis. 
Note  that  the  FSR  reduction  of  average  error  and  rms  error  is  only  slightly  smaller  than  the  OI-l 
case,  even  though  the  FSR  selects  on  average  only  three  observations  from  the  full  set 
of  up  to  20  observations  nearest  the  gridpoint. 

Table 19 shows the various correlations of the analyzed humidity fields with each other
and  with  actual  observations.  The  analyzed  field  is  very  highly  correlated  with  the  observed  humidity. 
Also,  the  analysis  derived  from  stepwise  regression,  which  uses  on  average  three  observations,  is  very 
highly  correlated  with  the  analysis  derived  from  an  average  of  17  of  the  closest  observations. 

5.3.5  The  500-mb  Wind  Analyses 

Finally, the OI scheme is also successful when analyzing wind fields. Table 20 shows the
average errors and rms errors of the v component of wind and confirms the similarity between
the analysis using an average of 17 observations (OI-l) and the analysis derived from the FSR scheme.

Table 18. The 500-mb humidity analysis errors using the OI at the observation points over the
Eurasian region. The OI-l and FSR are defined in Table 17, and FIRST GUESS is the
forecast error.

                                   FIRST GUESS    OI-l     FSR
Average of Observed
Minus Analyzed (%)                     1.48       -0.02     0.17
RMS Difference Between the
Observation and the Analyses (%)      26.6         4.61     4.6

Table 19. The average correlations among relative humidity analyses derived first from optimum
interpolation of up to 20 of the closest observations (OI-l) and second from optimum
interpolation of observations selected by FSR. These analyses are also
correlated with the FIRST GUESS (36-hr forecast) and observed value (OBVAL).

               OI-l     FSR      OBVAL    FIRST GUESS
OI-l           1        0.988    0.971    0.543
FSR            0.988    1        0.969    0.543
OBVAL          0.971    0.969    1        0.414
FIRST GUESS    0.543    0.543    0.414    1

Table 20. The difference between the FIRST GUESS (36-hr RWFM forecast) and the v-wind
component analysis on a grid over Eurasia. The OI-l and FSR are defined in Table 17.

                            OI-l     FSR
Average Value (m/s)         1        0.91
Root Mean Square (m/s)      3.77     3.65

The  accuracy  of  the  analysis  was  measured  by  comparing  an  analysis  made  at  gridpoints  and 
interpolated  to  observation  points  to  the  observed  value  at  that  point.  Table  21  shows  that  analyzed 
wind  errors,  when  compared  to  the  first  guess  errors,  were  reduced  by  83  percent,  which  is  a  typical 
error  reduction  for  the  scheme. 

Table 22 shows that the analyzed wind field is very highly correlated with the observed wind
field. Also, the analysis derived from stepwise regression, which uses on average three observations,
is very highly correlated with the analysis derived from an average of 17 of the closest
observations.

Table 21. The 500-mb v-wind component analysis errors (m/s) using the OI at the observation
points over the Eurasian region. FIRST GUESS is defined in Table 18 and OI-l and
FSR are defined in Table 17.

                                    FIRST GUESS    OI-l     FSR
Average of Observed
Minus Analyzed (m/s)                   -1.36       -0.1     -0.17
RMS Difference (m/s) Between the
Observation and the Analysis            6.02        0.93     0.99

Table 22. The average correlations among analyses of the v-wind component derived first from
optimum interpolation of up to 20 of the closest observations (OI-l) and second from
optimum interpolation of observations selected by FSR. These analyses are also
correlated with the FIRST GUESS (36-hr forecast) and OBVAL.

               OI-l      FSR       OBVAL     FIRST GUESS
OI-l           1         0.9899    0.9879    0.8935
FSR            0.9899    1         0.9876    0.8935
OBVAL          0.9879    0.9876    1         0.8720
FIRST GUESS    0.8935    0.8935    0.8720    1


The  results  of  experiments  with  the  u-component  of  wind  are  virtually  identical.  Therefore, 
they  offer  no  further  insight. 


6.  STATUS  AND  PLANS 

STC  Task  1,  development  of  a  RAP,  is  nearly  completed.  The  OI  scheme,  using  univariate 
correlations  and  stepwise  regression,  has  successfully  analyzed  all  meteorological  variables.  All  that 
remains is to test the scheme with multivariate correlations to determine if a better analysis is
possible. Also, the scheme will be tested to ensure that neglecting observation errors is a valid
assumption in practice.

STC  Task  2,  calculation  and  modeling  of  the  first-guess  forecast  errors  and  the  OSSEs,  are 
now  the  main  focus  of  attention.  All  required  databases  have  been  prepared  to  support  the  error 
correlation  module,  described  in  Section  4.5.  The  RAP  will  test  the  validity  of  the  error  models  on 
independent  data  sets. 

STC  Task  3,  the  RAP  objective  verification  program,  has  been  informally  underway  for 
several  months  but  on  a  low  priority  basis.  It  will,  however,  be  ready  for  operational  use  before 
Task  2  requires  it.  Appendix  B  is  a  detailed  report  on  a  comparison  of  the  GSM  and  RWFM  with 
the  verifying  analyses  from  HIRAS.  The  comparison  was  completed  early  to  ensure  that  the  forecast 
database,  which  consists  of  12  nearly  independent  summer  and  winter  forecasts,  contained  only 
"good"  data  because  our  sample  is  too  small  to  allow  the  calculations  of  error  correlations  to  be 
overwhelmed  by  a  bad  forecast. 

STC  Task  4,  the  OSSEs,  began  in  December  1990.  The  observation  database  is  complete, 
and  the  analysis  and  forecast  databases  are  being  developed.  The  subtasks  detailed  in  Section  2.4 
will  be  accomplished  by  summer  1992. 

STC Task 5 extends the analysis prepared as a result of the RAP into a 12-hr forecast. This
portion of the project will be started in spring 1992.


7.  SUMMARY  AND  CONCLUSIONS 

The  RAP  represents  a  unique  technical  approach  to  a  regional  analysis.  No  other  group 
integrates  the  FSR  into  optimum  interpolation  to  select  observations  rigorously  for  input  into  an 
optimum  interpolation  scheme.  In  addition,  the  scheme  as  developed  in  Eq.  2  can  ignore  the 
observation  error,  which  is  an  unknown  quantity,  at  gridpoints.  This  is  not  the  typical  approach  to 
optimum  (or,  strictly  speaking,  statistical)  interpolation.  The  optimum  interpolation  analysis  scheme, 
even  when  using  a  relatively  simple,  univariate  correlation  model,  performs  accurate  analyses  of  any 
meteorological variable.


APPENDIX A


SOFTWARE  REPORT 

This  software  report  provides  a  brief  description  of  problems  encountered  with  government 
furnished  data  and  operational  software,  a  detailed  listing  of  the  required  software  modifications,  and 
a  short  summary  of  the  software  developed  for  the  regional  analysis  procedure  (RAP). 


1.  PROBLEMS 

A. DATA EXTRACTION FROM "FOREIGN" MAGNETIC TAPES

The  datasets  of  analyses  and  observations  from  the  U.S.  Air  Force  Environmental  Technology 
Applications  Center  (USAFETAC)  were  not  user-friendly,  especially  the  observations  (surface,  upper 
air,  and  satellite-measured  temperatures).  While  the  High-Resolution  Analysis  Model  (HIRAS) 
analyses  are  fixed-length  records  in  ASCII,  the  observations  are  variable  length,  unformatted  records. 
The VAX FORTRAN language and utility programs at the Phillips Laboratory Geophysics (PL/GL)
Computer Center are not well suited to read tapes generated on non-VAX hardware (that is, foreign
tapes) in any event; variable length, unformatted binary records present even more problems.
Nevertheless,  Science  and  Technology  Corporation  (STC)  developed  methods  to  decode  and  process 
DATSAV2  surface  data  and  DATSAV  upper  air  data,  both  of  which  are  in  binary  format. 

A  VAX  consultant  from  the  PL/GL  Computer  Center  provided  a  FORTRAN  subroutine, 
QIO_READ,  which  contained  VAX  system  subroutines  for  extracting  data  from  foreign  tapes. 
These  system  routines,  however,  produced  meteorological  nonsense  when  used  to  extract  upper  air 
data  from  binary  tapes.  After  examining  the  binary  data  from  tape  data  dumps  and  converting  the 
contents  of  each  byte  from  binary  to  decimal  format,  it  was  apparent  that  reversing  the  order  of  the 
bytes  yielded  meteorological  information.  The  IBM  hardware,  which  USAFETAC  uses,  processes 
each  byte  within  half  or  whole  words  in  reverse  order  during  data  transfers  to  and  from  tapes. 
Additional  programs  were  modified  into  STC  subroutines  FLIPHWORD  and  FLIPWWORD,  to 
properly  process  each  half  and  whole  word  integer  produced  by  IBM  hardware. 
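
The byte reversal described above is consistent with the difference between big-endian (IBM) and little-endian (VAX) integer storage. A minimal sketch of the operation follows. The function names are hypothetical stand-ins and do not reproduce the STC subroutines FLIPHWORD and FLIPWWORD.

```python
import struct

def flip_halfwords(raw):
    """Reverse the byte order inside every 2-byte (half) word of raw."""
    if len(raw) % 2:
        raise ValueError("length must be a multiple of 2 bytes")
    return b"".join(raw[i:i + 2][::-1] for i in range(0, len(raw), 2))

def flip_wholewords(raw):
    """Reverse the byte order inside every 4-byte (whole) word of raw."""
    if len(raw) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    return b"".join(raw[i:i + 4][::-1] for i in range(0, len(raw), 4))

# A 32-bit integer written in big-endian order, as on the source tape ...
tape_bytes = struct.pack(">i", 50000)
# ... decodes correctly in little-endian order once the word is flipped.
value = struct.unpack("<i", flip_wholewords(tape_bytes))[0]
```

Without the flip, the same four bytes decode to an unrelated number, which matches the "meteorological nonsense" noted above.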
B. DEBUGGING THE RELOCATABLE WINDOW ANALYSIS MODEL (RWFM)
FORECAST ERROR

A very large RWFM forecast error was caused by an error in the U.S. Air Force Global
Weather Central (AFGWC) software. The initial analysis and forecasts at 70 and 50 mb were much
too  warm;  the  height  field  was  more  than  5,000  m  above  standard!  The  other  variables  at  those 
levels  and  all  variables  at  the  100-mb  level  and  below,  however,  verified  well. 

The  cause  of  the  forecast  error  was  located  in  Subprogram  RWPOST,  the  post-processing 
program  of  the  RWFM  consisting  of  several  thousand  lines  of  code  (reference  the  AFGWC  RWFM 
package/Subprogram  RWPOST).  Exhaustive  checking  of  each  module  and  debugging  of  those  that 
could  have  caused  the  error  isolated  the  problem  in  a  section  of  code  in  Subroutine  CALC01,  which 
(for  no  apparent  reason)  recalculates  the  temperature  at  the  top  sigma  level.  Not  only  was  the 
recalculation  unnecessary,  it  was  incorrect. 

STC called AFGWC to advise them of the error and of the remedy: removing the three
(illogical) lines of code that recalculated the variables TSUM, PENV, and TS. (The code was
located immediately above the statement 90 CONTINUE.)

C.  GRID  CONVERSION 

The  conversion  from  the  RWFM  grid  to  the  uniform  gridded  data  field  (UGDF)  grid  was 
a  deceptively  "simple"  task.  It  was  expected  to  be  simple  because  AFGWC  had  developed  the 
software  and  prepared  the  documentation  more  than  a  year  before  sending  it  to  STC;  the  simplicity 
of  the  task  was  deceptive,  however,  because  some  of  the  software  and  documentation  had  errors. 
The  errors  are  documented  thoroughly  for  AFGWC’s  information. 

The  requirement  was  to  interpolate  fields  from  the  RWFM  grid  on  a  Lambert  conformal  (or 
Mercator)  projection  to  the  UGDF  on  a  polar  stereographic  (or  Mercator)  projection.  STC  had 
followed  AFGWC’s  lead  by  running  the  RWFM  on  the  Lambert  conformal  projection  for  a 
midlatitude  window.  This  was  a  sensible  choice  that  nevertheless  caused  difficulties  when  converting 
to  the  UGDF  grid. 




When  the  latest  version  of  software  and  documentation  arrived  from  AFGWC,  STC  learned 
for  the  first  time  that  the  UGDF  grid  was  defined  only  for  polar  stereographic  and  Mercator 
projections.  Unfortunately,  no  code  for  interpolating  a  grid  system  on  a  Lambert  conformal 
projection  to  a  UGDF  grid  was  provided.  This  problem  became  more  complicated  because  the 
software  was  neither  thoroughly  tested  (there  were  some  obvious  errors,  such  as  a  grid  length  that 
changed  across  the  Greenwich  Meridian,  that  testing  would  have  revealed)  nor  well  documented 
(AFGWC  could  not  answer  questions  about  input  required  by  their  software  because  the 
programmers  were  unavailable).  After  finding  additional  obscure  errors  in  both  the  software  and 
the  documentation,  STC  determined  that  developing  original  software  was  the  best  solution  to  the 
problem.  The  errors,  however,  are  documented  below. 

1.  Reference  the  AFGWC/TN  -  79/003  (REV),  MAP  PROJECTIONS  AND  GRID 
SYSTEMS  FOR  METEOROLOGICAL  APPLICATIONS. 

a. Equation 2.11 (on page 17 in the Tech Note) describes the y-axis in an image
plane coordinate system for a Mercator projection. This equation, however, is a
specific version of the more general formula incorporated into the RWFM
preprocessing software, and is valid only if the center point (reference) latitude
(PHI0) is 0°.

The  Tech  Note,  however,  does  not  refer  to  the  generalized  equation  used  in  the 
RWFM  software  nor  does  it  address  the  limitations  of  Eq.  2.11.  Furthermore,  the 
software  documentation  does  not  discuss  the  transformation  equation,  which  can 
only  be  examined  upon  detailed  inspection  of  the  source  code.  This  incomplete 
documentation  led  to  erroneous  software  based  on  the  assumption  that  Eq.  2.11 
was  used  in  the  AFGWC  code. 

In  the  derivation  leading  to  the  general  form  of  Eq.  2.11,  reference  was  made  to 
code  in  Subroutine  MERC,  which  sets  up  a  Mercator  grid.  MERC  calculates  the 
latitude  and  longitude  for  each  gridpoint.  A  rearrangement  of  terms  in  the  section 
of  code  that  calculates  the  latitude  (contained  within  the  DO  130  loop)  yields  the 
following  expression: 

PHI = [ 2.0*ATAN(EXP(Y/ECOS)) - 2.0*PI/4 ] + PHI0

where PHI is the latitude of the gridpoint, ECOS is the length of the cosine side of the
Earth’s radius, and Y is the y-coordinate on the grid.

This expression for PHI follows from the code

DEGLAT(U) = PHI2 + CENLTD

after substituting for PHI1 and PHI2, and defining PHI0 = CENLTD. After
several algebraic manipulations, the expression can be solved for the y-coordinate:

Y = ECOS * ln[ TAN( 1/2 (PHI - PHI0 + PI/2) ) ].

Now ECOS = ( A * COS[true latitude] ), where A is the Earth’s radius, in
Subroutine MERC. Substituting the value of ECOS into the above expression for
Y allows a direct comparison with Equation 2.11 of the Tech Note,

Y = (A*COS(PHI1)) * ln[ TAN( 1/2 (PHI + PI/2) ) ],

where PHI1 is the true latitude.

Clearly, the expressions for the y-coordinate in Subroutine MERC and Eq. 2.11
in the Tech Note are equivalent if and only if PHI = (PHI-PHI0), that is, when
PHI0 = 0, which means that the center point latitude for the Mercator projection
must be the Equator. Nothing in the documentation, however, required this
limitation.

b. Equation 3.43 (on page 68 of the AFGWC Tech Note) calculates the latitude
(PHI) of any gridpoint, given its I and J coordinates. This equation applies only to
a whole mesh grid, however, because Δλ, the longitudinal grid spacing in degrees,
is not multiplied by the grid scale factor (G) as it is in Eq. 3.44. Therefore, Eq.
3.43 yields latitudes much too large due to the larger number resulting from the
calculation of (JE-J). To correct Eq. 3.43, Δλ must be multiplied by the mesh
scale factor G (for example, G = +0.25 for a quarter-mesh grid in the Northern
Hemisphere) prior to taking the exponential. The correct term is, then,

exp[ PI * G * Δλ/180 * (JE-J) ].

2.  Reference  AFGWC  RWFM  package/Subprogram  RWGRID.  STC  found  a  mistake 
in  the  mesh  generation  code  that  resulted  in  increased  forecast  errors.  For  the 
Mercator  projection  the  true  latitude  was  set  to  30°  from  the  center  point  latitude  in 
a  parameter  statement  in  Subroutine  SETGRD.  But  this  is  inconsistent  with 
Subroutine  PARAM  in  the  DO  80  loop,  which  is  calculating  map  factors  for  a  Mercator 
projection.  The  calculation  of  the  variable  QMAP  involves  a  "hard-wired"  constant 
value  of  0.92388.  This  happens  to  be  the  cosine  of  22.5°,  a  value  apparently  not 
changed  when  the  software  was  modified  to  set  the  true  latitude  to  30°  from  the  center 
point  latitude.  The  oversight  produced  an  erroneous  map  scaling  factor,  which  in  turn 
affected  the  RWFM  forecasts  generating  the  STC  database  (to  be  used  for  calculating 
error  correlations).  This  error  in  the  code  will  be  treated  as  a  forecast  error  for  RAP 
purposes. 
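
Two of the points documented above can be checked numerically. The sketch below first compares the general Mercator expression rearranged from Subroutine MERC with the form of Eq. 2.11, and then evaluates the cosines relevant to the hard-wired QMAP constant. The Earth radius, the sample latitudes, and the function names are illustrative assumptions. Angles are in radians, whereas the AFGWC code may work in degrees.

```python
import math

def mercator_y_general(phi, phi0, ecos):
    """y-coordinate as rearranged from Subroutine MERC (includes PHI0)."""
    return ecos * math.log(math.tan(0.5 * (phi - phi0 + math.pi / 2)))

def mercator_y_eq211(phi, ecos):
    """y-coordinate in the form of Eq. 2.11, which has no PHI0 term."""
    return ecos * math.log(math.tan(0.5 * (phi + math.pi / 2)))

def mercator_phi(y, phi0, ecos):
    """Latitude from y: the PHI expression quoted from MERC."""
    return (2.0 * math.atan(math.exp(y / ecos)) - 2.0 * math.pi / 4) + phi0

A = 6371.0                               # assumed mean Earth radius (km)
ecos = A * math.cos(math.radians(30.0))  # sample true latitude of 30 degrees
phi = math.radians(45.0)
phi0 = math.radians(20.0)                # sample non-equatorial center latitude

# The two forms agree when PHI0 = 0 and disagree otherwise.
same = mercator_y_general(phi, 0.0, ecos) - mercator_y_eq211(phi, ecos)
differ = mercator_y_general(phi, phi0, ecos) - mercator_y_eq211(phi, ecos)

# The PHI and Y expressions are inverses of each other for any PHI0.
round_trip = mercator_phi(mercator_y_general(phi, phi0, ecos), phi0, ecos)

# The hard-wired QMAP constant 0.92388 matches cos(22.5 deg), not cos(30 deg).
cos_22_5 = math.cos(math.radians(22.5))
cos_30 = math.cos(math.radians(30.0))
```

With these sample values the two y-coordinates differ by well over a thousand kilometers when PHI0 is 20°. The two cosines differ by roughly 6 to 7 percent.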


2.  SOFTWARE  DEVELOPMENT 

The  RAP  has  mostly  been  a  large  software  development  project,  consisting  of  five  major 
modules.  The  purpose  of  these  modules  is  to  read  the  data  from  the  magnetic  tapes  from 
USAFETAC,  to  build  the  forecast  and  observation  databases,  to  perform  numerical  analysis 
experiments,  to  calculate  forecast  error  correlations,  and  to  build  a  database  of  simulated  First 
GARP  [Global  Atmospheric  Research  Program]  Global  Experiment  (FGGE)-2b  observations  for 
the  OSSEs. 

A.  EXTRACTING  DATA  FROM  FOREIGN  MAGNETIC  TAPES  AND  BUILDING  THE  RAP 
FORECAST  AND  OBSERVATION  DATABASES 

In addition to the modifications of the VAX-provided software discussed in Section 1A,
programs were developed to read HIRAS, DATSAV2 surface observations, and DATSAV upper
air observations (RAOB, aircraft, and satellite).

Sorting and merging all the data required development of a database management system
that was reliable, flexible, and user-friendly. All data are stored on the PL/GP Computer Center’s
Centralized File Storage System (CFSS) for efficient retrieval and also on magnetic tape for backup.
In contrast to storage on USAFETAC’s tapes, the data are sorted synoptically to facilitate the
calculation of forecast errors.

The primary programs are: BACKUP_CFSS.COM, which copies files from CFSS to
magnetic tape; HIRASREAD.FOR, which reads HIRAS analyses from the magnetic tapes from
USAFETAC/OL-A; MSC_EDIT.FOR, which selects all observing stations within a window defined
by the program SETGRID.FOR; MSC_SF.FOR and MSC_UP.FOR, which create the global master
station catalog for surface and rawinsonde stations, respectively; SATWRITE.FOR, which combines
satellite data from several tapes onto one tape; READ_ETAC.COM, which is a command procedure
for running the tape-reading jobs in batch mode; READ_VTAPE.COM, which copies files from
magnetic tape to CFSS; AIRREAD.FOR, which reads aircraft data from USAFETAC tapes;
SURFDREAD.FOR, which reads surface data from USAFETAC tapes; and UPPERREAD.FOR,
which reads upper air data from USAFETAC tapes.

B.  OPTIMUM  INTERPOLATION 

The  optimum  interpolation  scheme,  based  on  objective  data  selection  by  forward  stepwise 
regression  (FSR),  has  been  carefully  tested  on  all  meteorological  variables.  This  includes  gross  error 
and  buddy  checks,  which  identify  observations  that  might  be  eliminated  from  an  analysis. 

The primary programs are BUDCHK.FOR, which is described in great detail in Section 3.5
of the report; EXP3.FOR, which tests the algorithm that selects observations by forward stepwise
regression; EXP4.FOR (EXP5.FOR), which calculates the analyzed value at an observation (grid)
point by optimum interpolation; INTRP.FOR, which maps from the RWFM’s Lambert conformal
or Mercator grid to the UGDF polar stereographic or Mercator grid, respectively; SETGRID.FOR,
which creates a grid and calculates the gridpoint latitude and longitude from user-defined
parameters: central point, resolution, projection, and number of columns and rows; and
RAPLIB.FOR, which contains a library of subroutines called by EXP3.FOR, EXP4.FOR,
EXP5.FOR, and others.

C. ERROR CORRELATION CALCULATIONS


Only the general software for calculating error correlations was completed as of the writing
of this report. The primary programs are COR_BIN.FOR, which "bins" all pairs of stations
according to direction and distance; CORCALC.FOR, which calculates the horizontal error
correlation coefficients of paired meteorological elements; CORCALC1.FOR, which calculates the
correlation coefficients of paired meteorological elements; COR_CHDATA.FOR, which checks the
observations obtained by COR_CHDATA.FOR for gross errors; COR_FGDATA, which reads a
forecast file and calculates a first guess at an observation point by using a gridpoints-to-station
interpolation; COR_SORT.FOR, which sorts the data for use in the program COR_CALC.FOR;
COR_UPDATA.FOR, which extracts the observed values from a global synoptic data file of upper
air observations; MSC_UP.COM, which creates the master rawinsonde station catalog;
MSC_SF.COM, which creates the master surface station catalog; and MSC_EDIT.COM, which
provides the required station pairs.
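
The binning step can be pictured with a short sketch. It is not the COR_BIN.FOR code: the bin widths, the Earth radius, the great-circle formulas, and the function names are all assumptions chosen for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # assumed mean radius

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) and initial bearing (degrees clockwise
    from north) from station 1 to station 2; inputs in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    dist = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    bearing = math.degrees(math.atan2(
        math.sin(dlon) * math.cos(p2),
        math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)))
    return dist, bearing % 360.0

def bin_station_pairs(stations, dist_width_km=100.0, dir_width_deg=45.0):
    """stations: dict of name -> (lat, lon). Returns a dict mapping
    (distance bin, direction bin) to the list of station pairs in that bin."""
    n_dir = int(round(360.0 / dir_width_deg))
    bins = {}
    for (n1, (la1, lo1)), (n2, (la2, lo2)) in combinations(sorted(stations.items()), 2):
        dist, bearing = distance_and_bearing(la1, lo1, la2, lo2)
        key = (int(dist // dist_width_km), int(bearing // dir_width_deg) % n_dir)
        bins.setdefault(key, []).append((n1, n2))
    return bins
```

Each bin then collects the forecast-minus-observed differences of its station pairs, from which a correlation coefficient per distance and direction class can be computed.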

D. OBSERVING SYSTEM SIMULATION EXPERIMENTS

Less software has been required so far than for the above modules because PL/GP
obtained packages for data extraction and unpacking of the T-106 forecasts and FGGE-2b
observations. In addition, PL/GP made the Air Force Geophysical Laboratory (AFGL) Statistical
Analysis Package (ASAP) software available. So far the work has mostly consisted of making
modifications to the provided software and developing scripts for UNICOS on the CRAY-2 at the
Phillips Laboratory Supercomputer Center.

3. STC’S MODIFICATIONS TO THE RWFM

A.  PREPARING  DATABASE  INPUT 

STC developed a post-processor for the PL/GP Global Spectral Model (GSM) database files.
The program performed two tasks. First, it expanded the PL/GP GSM database fields from
144*73 arrays to 145*73 arrays by setting the 145th column equal to column 1. Second,
it wrote the GSM database parameters in reverse order; that is, the 12th pressure level was
written first and the first pressure level was written last, to match the format in which the
RWFM reads the GSM database.
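
The two post-processor tasks can be sketched as follows. This is a Python illustration on synthetic data, not STC's program: the function names are hypothetical, and each field is assumed to be stored as 73 latitude rows of 144 longitude values, so that the "145th column" is a repeat of the first longitude.

```python
def add_wraparound_column(field):
    """Append a copy of the first longitude value to every latitude row."""
    return [list(row) + [row[0]] for row in field]


def postprocess_levels(levels):
    """Return the fields with the wraparound column added, last level first."""
    return [add_wraparound_column(field) for field in reversed(levels)]


# Synthetic demonstration: 12 pressure levels of 73 x 144 values,
# where each value encodes its level and longitude index.
NLEV, NLAT, NLON = 12, 73, 144
levels = [
    [[1000.0 * lev + lon for lon in range(NLON)] for _ in range(NLAT)]
    for lev in range(1, NLEV + 1)
]
output = postprocess_levels(levels)
```

In this sketch the first field of `output` is level 12 and the last is level 1, and every row holds 145 values whose last entry equals its first.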

B.  MODIFICATIONS  TO  THE  RWGRID  SUBPROGRAM 

1.  All  occurrences  of  #IMAX  were  changed  to  61. 

2.  All  occurrences  of  #JMAX  were  changed  to  61. 

3.  All  occurrences  of  #KLEV  were  changed  to  16. 

4.  Added  Call  Dropfile(0). 

5.  Added  open  statement  for  unit  5  card  image  data  "GRIDIN". 

6.  Changed  all  #IGMAX  to  145. 

7.  Changed  all  #JGMAX  to  73. 

8.  Fixed  open  statement  for  opening  "FIXED"  file  from  ’UNFORMATTED’  to 
’FORMATTED’. 

9.  Compiled  with  cft77. 

10. Linked with ldr.

11.  Executed  code. 

This  code  created  2  files:  CNGM  and  GRID. 

A  prerequisite  to  running  RWGRID  is  a  fixed  field  file  containing  a  global  grid  of  terrain 
heights  and  a  global  grid  of  surface  drag  coefficients.  This  global  grid  has  a  resolution  of  2.5°  x  2.5°. 
After  using  the  fixed  field  file  provided  by  AFGWC  and  receiving  abnormal  results,  STC  switched 
to a fixed field file from the RLAM at PL. This fixed field file required new software to read the
RLAM  file  and  reformat  it  as  required  by  RWGRID. 
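
The values 145 and 73 used for #IGMAX and #JGMAX follow from the 2.5° grid spacing. The Python sketch below shows the arithmetic; the function name is hypothetical, and it assumes that the latitude rows include both poles and that the extra longitude column repeats the first one, as described for the post-processor in Section 3.A.

```python
def global_grid_dims(spacing_deg, repeat_first_longitude=True):
    """Number of longitude and latitude points on a regular global grid."""
    n_lon = round(360.0 / spacing_deg)
    if repeat_first_longitude:
        n_lon += 1  # extra column that duplicates the first longitude
    n_lat = round(180.0 / spacing_deg) + 1  # both poles included
    return n_lon, n_lat
```

For a 2.5° spacing this gives 145 x 73 points with the repeated column and 144 x 73 points without it.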

C.  MODIFICATIONS  TO  THE  RWSFCT  SUBPROGRAM 

1. All occurrences of #IMAX were changed to 61.

2. All occurrences of #JMAX were changed to 61.

3. Added Call Dropfile(0).

4. Opened "SSMT" file (unformatted).

5. Opened "TSEA" file (unformatted).

6. Opened "GRID" file (unformatted).

7. Opened "AVGSFCT" file (unformatted).

8.  Wrote  a  small  program  to  convert  card  images  of  sea  surface  temperatures  and 
12-hr-old  1,000-mb  temperatures  into  an  unformatted  SSMT  file. 

9.  The  6  April  1988,  00  UTC  Julian  hour  (177648)  was  added  as  the  first  card  in  the  sea 
surface  temperature  file  prior  to  running  the  program  to  create  SSMT. 

10. Fixed a bug in an interpolation routine, as requested by AFGWC.

11.  Compiled  with  cft77. 

12.  Linked  with  ldr. 

13.  Executed  code. 

This program created two files: TSEA and AVGSFCT.
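
The Julian hour quoted in item 9 can be reproduced with a short calculation. The Python sketch below assumes that the Julian hour counts whole hours from 00 UTC 31 December 1967; this epoch is inferred from the value 177648 given above for 00 UTC 6 April 1988 and is not stated in the report. The function name is hypothetical.

```python
from datetime import datetime

# Assumed epoch, inferred from the report's example value of 177648.
JULIAN_HOUR_EPOCH = datetime(1967, 12, 31, 0)


def julian_hour(when):
    """Whole hours elapsed between the assumed epoch and the given UTC time."""
    delta = when - JULIAN_HOUR_EPOCH
    return delta.days * 24 + delta.seconds // 3600
```

Under this assumption, `julian_hour(datetime(1988, 4, 6, 0))` returns 177648, which matches the first card added to the sea surface temperature file.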

D.  MODIFICATIONS  MADE  TO  THE  RWANAL  SUBPROGRAM 

Because  the  first  version  of  RWANAL  had  numerous  errors  in  the  code,  STC  implemented 
the  latest  version  called  RWHR00.  The  following  changes  were  made  to  RWHR00: 

1. All occurrences of #IMAX were changed to 61.

2. All occurrences of #JMAX were changed to 61.

3. All occurrences of #MAXPL were changed to 12.

4. All occurrences of #KLEV were changed to 16.

5. All occurrences of #IGMAX were changed to 145.

6. All occurrences of #JGMAX were changed to 73.

7. Added Call Dropfile(0).

8. Opened unit 5 "ANALIN" file for card image formatted input.

9. Opened unit 10 "CNGM" file for unformatted input.

10. Opened unit 11 "NEWCNGM" file for unformatted output.

11. Opened unit 12 "DBGS00F" file for unformatted input.

12. Opened unit 20 "GRID" file for unformatted input.

13. Opened unit 30 "SSMT" file for unformatted input.

14. Opened unit 40 "TSEA" file for unformatted input.

15. Opened unit 50 "FHR00" file for unformatted output.

16. Opened unit 60 "RESTART" file for unformatted output.

17. Opened unit 71 "PREWAM" file for unformatted input.

18. Most of the modifications were in Subroutine GETGSM.

This  subroutine  reads  in  the  GSM  database  file  created  by  our  GSM  postprocessor  program. 
The  changes  made  to  Subroutine  GETGSM  were: 

a. Added five new arrays (udummy, vdummy, tdummy, zdummy, and rdummy), each
dimensioned (145*73).

b. The DATFLD call was commented out and replaced by READ(12) udummy. There
was no need to descale our GSM data for any of the fields, so the descaling code was
commented out too; instead, the DO loop simply sets gsmdat(i) = udummy(i).

c. Repeated the procedure above (18b) for the v-wind component, using READ(12)
vdummy to read the data.

d. Repeated the same procedure for temperatures, using READ(12) tdummy to
read the data.

e. Repeated the same procedure for the heights, using READ(12) zdummy to
read the data.

f.  Changed  all  occurrences  of  #IMOIS  to  6. 

g.  Repeated  the  (18b)  procedure  above  for  relative  humidity,  using  READ(12) 
rdummy  to  read  the  data. 

h. Modified the first read of the "SSMT" file to read the entire record (RJLHR, ARRAY), not
just (RJLHR). ARRAY was basically a dummy array dimensioned to 145*73.

i. The GSM database Julian hour SRCJUL was set equal to the Julian hour found in
the SSMT file because no Julian hour was found in the PL/GP database.

j.  Compiled  with  cft77. 

k. Linked with ldr.

l.  Executed  code. 

E.  MODIFICATIONS  MADE  TO  SUBPROGRAM  RWBNDY 

The  following  changes  were  made  to  the  4  January  1990  version  of  AFGWC’s  RWBNDY 
code,  so  it  could  be  executed  on  the  Cray-2  at  the  AFWL. 

1. Created a CHARACTER*5 variable to store the name of the GSM verification file that
will be opened.

2. Added Call Dropfile(0) as the first executable statement.

3. Opened unit 5 "BNDYIN" for card image input.

4. Opened unit 40 "FHR00" for unformatted input.

5.  Opened  unit  30  "GRID"  for  unformatted  input. 

6.  Opened  unit  20  "NEWCNGM"  for  unformatted  input. 

7.  Opened  unit  50  "BTEND"  for  unformatted  output. 

8.  The  following  code  was  added  after  the  assignment  to  FILOUT: 

      IF (FCTIME .EQ. 6) THEN
         OFILE = 'GSMHR06'
      ELSE
         WRITE(OFILE,90) FCTIME
   90    FORMAT('GSMHR',I2)
      ENDIF
      WRITE(6,91) OFILE
   91 FORMAT('OPENING ',A8,' FOR OUTPUT')
      OPEN(FILOUT,FILE=OFILE,STATUS='UNKNOWN',
     1     FORM='UNFORMATTED')

9.  Changed  all  occurrences  of  #BPTS  to  5. 

10.  Changed  all  occurrences  of  #IMAX  to  61. 

11.  Changed  all  occurrences  of  #JMAX  to  61. 

12.  Changed  all  occurrences  of  #KLEV  to  16. 

13.  Changed  all  occurrences  of  #MAXDIM  to  61. 

14.  Changed  all  occurrences  of  #MAXPL  to  12. 

15.  The  following  modifications  were  made  to  Subroutine  GETGSM  to  read  the  GSM 
database. 

a.  Added  INTEGER  FILGSM  and  PARAMETER  (FILGSM  =  19). 

b.  Added  CHARACTER*?  LITRAL 

c. Added REAL UDUMMY(145*73), VDUMMY(145*73), TDUMMY(145*73),
ZDUMMY(145*73), RDUMMY(145*73).

d. Changed the data initialization of PEROT to have 00 as its first value, not 06.

e. Added the following code in order to open the proper GSM database file:

      LITRAL = 'DBGS' // PEROT(FCINDX) // 'F'
      OPEN(FILGSM,FILE=LITRAL,STATUS='UNKNOWN',
     1     FORM='UNFORMATTED')
      WRITE(6,21) LITRAL
   21 FORMAT(//,' OPENING ',A8,' FOR INPUT')

f.  Commented  out  the  DATFLD  call  and  added  a  READ(12)  udummy,  where 
udummy  was  declared  as  a  real  array  dimensioned  to  145*73.  It  was  unnecessary 
to  descale  the  u-winds,  so  that  code  was  commented  out  and  the  DO  loop  was  set 
by  letting  GSMDAT(I)  =  UDUMMY(I). 

g. Used a similar procedure with the v-winds, but the real array vdummy was dimensioned
145*73 and the READ(12) vdummy statement obtained the data.

h. Used the same procedure with the temperatures, but the real array tdummy was
dimensioned 145*73 and the READ(12) tdummy statement obtained the data.

i. Did the same for heights, with the real array zdummy dimensioned 145*73 and a
READ(12) zdummy statement.

j. Did the same for relative humidity, with the real array rdummy dimensioned 145*73
and a READ(12) rdummy statement.

16.  Compiled  with  cft77. 

17. Linked with ldr.

18.  Executed  code. 

F.  MODIFICATIONS  MADE  TO  SUBPROGRAM  RWQNGM 

The  following  changes  were  made  to  the  4  January  1990  version  of  RWQNGM  so  it  could 
be  executed  at  the  Phillips  Laboratory  Supercomputer  Center. 

1. Added Call Dropfile(0) to the main routine as the first executable statement in the
program.

2.  Changed  all  occurrences  of  #IMAX  to  61. 

3.  Changed  all  occurrences  of  #JMAX  to  61. 

4.  Changed  all  occurrences  of  #MAXD  to  61. 

5. Changed all occurrences of #KLEV to 16.

6.  Changed  all  occurrences  of  #NBND  to  5. 

7.  Added  open  statements  in  Subroutine  AINDSK: 

a.  Opened  unit  5  "QNGMIN",  a  formatted  file  for  card  image  input. 

b.  Opened  unit  24  "PSEUDO",  an  unformatted  file  for  input. 

c.  Opened  unit  26  "GRID",  an  unformatted  file  for  input. 

d.  Opened  unit  29  "QNCNGM",  an  unformatted  file  for  output  from  RWGRID. 

8.  Opened  unit  28  "RESTART",  an  unformatted  file  for  input. 

9.  Opened  unit  20  "BTEND",  an  unformatted  file  for  input. 

10.  Opened  unit  FILFCT  "FHR00",  an  unformatted  file  for  input,  after  the 
FILFCT  =  30  +  (IPRNT  -IRPT)  statement;  so  that  FILFCT  would  be  defined  prior 
to  opening. 

11. Created in Subroutine ARUN a character variable called OFILE; then, after the
declaration of FILFCT, added the following code to create unique filenames for each
forecast hour:

      IF (ITIME .LT. 10) THEN
         WRITE (OFILE,62) ITIME
   62    FORMAT ('FHR0',I1)
      ELSE
         WRITE (OFILE,63) ITIME
   63    FORMAT ('FHR',I2)
      ENDIF
      WRITE (6,64) OFILE, ITIME
   64 FORMAT ('**** OPENING FORECAST FILE ',A8,
     1        ' FOR ',I2,' Hr FORECAST ')
      OPEN (FILFCT,FILE=OFILE,STATUS='NEW',FORM='UNFORMATTED')

12.  Compiled  with  cft77. 

13. Linked with ldr.

14.  Executed  code. 
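
The separate branch for forecast hours below 10 is presumably needed because the Fortran I2 edit descriptor pads a one-digit value with a blank rather than a zero, which would put an embedded blank in the file name. The Python sketch below mimics that behavior for illustration only; the helper names are hypothetical, and forecast hours are assumed to lie between 0 and 99.

```python
def fortran_i2(value):
    """Mimic the Fortran I2 edit descriptor for 0 <= value <= 99 (blank padded)."""
    if not 0 <= value <= 99:
        raise ValueError("sketch only handles 0..99")
    return str(value).rjust(2)


def forecast_file_name(itime):
    """Build the forecast file name the same way as the two FORMAT branches."""
    if itime < 10:
        return "FHR0" + str(itime)      # corresponds to FORMAT ('FHR0',I1)
    return "FHR" + fortran_i2(itime)    # corresponds to FORMAT ('FHR',I2)
```

Without the first branch, hour 6 would be written as `"FHR 6"`; with it, the name is `"FHR06"`. The same consideration applies to the GSMHR06 special case in the RWBNDY modifications.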


APPENDIX B


COMPARISON  OF  THE  RWFM  AND  GSM  WITH  HIRAS 


As  part  of  the  regional  analysis  procedure  (RAP)  quality  control  program,  Science  and 
Technology  Corporation  completed  careful  objective  and  subjective  analyses  of  the  36-hr  forecasts 
by  the  Air  Force  Global  Weather  Central  (AFGWC)  Global  Spectral  Model  (GSM)  and  the 
Relocatable  Window  Forecast  Model  (RWFM).  In  two  windows,  one  including  Eurasia  and  the 
other  Central  America  and  southern  North  America,  12  surface  and  500-mb  forecasts  at  60-hr 
intervals in July 1988 and January 1989 were chosen for analysis. Part I is a discussion of the subjective
analysis,  which  is  based  mostly  on  root-mean-square  (rms)  errors,  and  Part  II  is  a  discussion  of  the 
objective  analyses. 

In  general  the  GSM  is  a  slightly  better  model  but  not  significantly  better.  At  some  locations 
on  some  occasions  the  RWFM  made  better  forecasts;  however,  in  the  final  analysis  STC  could  not 
make  a  case  for  using  the  RWFM. 
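
For reference, the rms error statistic used in Part I can be written as a short function. The Python sketch below computes an unweighted rms difference between forecast and verifying values at the same grid points; whether the values in this report were area weighted is not stated, so the unweighted form and the function name are assumptions.

```python
import math


def rms_error(forecast, analysis):
    """Unweighted root-mean-square difference between two equal-length sequences."""
    if len(forecast) != len(analysis) or len(forecast) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((f - a) ** 2 for f, a in zip(forecast, analysis))
    return math.sqrt(total / len(forecast))
```

A statement such as "the GSM rms errors are 1 to 3 mb smaller" then refers to the difference between this statistic for the GSM forecast and for the RWFM forecast, each verified against HIRAS.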


PART I. A COMPARISON OF RWFM AND GSM ERRORS

The  GSM  36-hr  Eurasian  January  mass  field  forecasts  provide  consistently  smaller  rms  errors 
than the RWFM. The GSM temperature forecasts have rms errors between 0.5 and 1.5 K smaller
at  all  levels  for  all  cases.  The  GSM  height  forecast  rms  errors  range  from  10  to  50  m  smaller  than 
the  RWFM  between  the  1,000-  and  500-mb  levels  and  50  to  100  m  smaller  between  the  100-  and 
50-mb  levels  for  all  cases.  The  GSM  sea  level  pressure  rms  errors  are  1  to  3  mb  smaller  for  all  cases 
except  00  UTC  1-6-89  (0.3  mb  larger),  00  UTC  1-11-89  (0.3  mb  larger),  and  00  UTC  1-26-89  (1.0 
mb  larger).  The  GSM  relative  humidity  rms  errors  are  generally  5  to  10  percent  smaller  at  most 
levels  for  all  cases.  The  RWFM  and  GSM  wind  forecast  rms  errors  are  of  similar  quality,  especially 
the v component of the wind vector. (From here on, u component refers to the west-to-east
component of the wind vector, and similarly the v component refers to the south-to-north component of
the wind vector.) The GSM u-component rms errors are 0.5 to 1.5 m/s smaller at most levels in
most  cases,  but  the  RWFM  had  v-component  rms  errors  at  least  equal  to  those  of  the  GSM  in  9  of 
the  12  cases. 

The GSM 36-hr Eurasian July forecasts provide height rms errors that are 5 to 10 m
smaller at most levels in all cases. The differences between the RWFM and GSM temperature rms
errors are very small except at the 70- and 50-mb levels, where the RWFM rms errors are 0.5 to
1.0 K smaller in every case. The GSM sea level pressure rms errors are about 1 mb smaller than
the RWFM in every case except 12 UTC 1-3-89 (0.1 mb smaller). The GSM relative humidity rms
errors are 1 to 5 percent smaller at most levels in most of the cases. The RWFM u-component
and v-component rms errors are 0.5 to 1.5 m/s smaller than the GSM at almost all levels for all the
cases.


The RWFM and GSM 36-hr Central American January rms errors of the mass field forecast
are similar. The RWFM and GSM height and temperature rms errors are similar up to the 100-mb
level, above which the RWFM errors are about 10 m and 0.5 K smaller for almost all the cases. The
GSM sea level pressure rms errors are 0.2 to 0.5 mb smaller than the RWFM for all cases except 12
UTC 1-3-89 (1.3 mb smaller) and 00 UTC 1-26-89 (0.8 mb smaller). The RWFM and GSM relative
humidity rms errors are within 1 to 3 percent of each other at all levels for almost all the cases. The
RWFM and GSM momentum field rms errors are similar except between 500 and 200 mb, where
the GSM u-component and v-component rms errors are 0.5 to 2.0 m/s smaller than the RWFM for
almost all the cases.

The RWFM 36-hr Central American July height, temperature, and relative humidity forecasts
provide rms errors that are equal to or smaller than the GSM at most levels in every case. The
RWFM temperature rms errors are 0.5 to 1.5 K smaller than the GSM between 100 and 50 mb in
every case. The RWFM height rms errors are 5 to 10 m smaller than the GSM between 300 and
50 mb in most cases. The GSM sea level pressure rms errors are 0.1 to 0.3 mb lower than the
RWFM in all cases except 00 UTC 7-1-88 (equal), 00 UTC 7-6-88 (0.1 mb larger), 00 UTC 7-11-88
(0.4 mb larger), and 00 UTC 7-26-88 (equal). The RWFM and GSM relative humidity rms errors
are within 1 to 3 percent at all levels for every case. The RWFM u-component and v-component
rms errors are 0.5 to 1.0 m/s smaller than those of the GSM in most cases, especially between 1,000
and 200 mb.


PART II. AN OBJECTIVE COMPARISON OF THE 36-HR FORECASTS BY
THE RWFM AND GSM TO THE VERIFYING ANALYSES BY THE
HIGH RESOLUTION ANALYSIS SYSTEM (HIRAS) MODEL


EURASIAN WINDOW IN JANUARY 1989


00  UTC  1-1:  The  RWFM  and  GSM  failed  to  dig  a  strong  short  wave  trough  and  to  intensify 
the associated 959 mb surface low north of Norway by 9 and 11 mb, respectively. The RWFM
underdeveloped  the  western  Eurasia  surface  high  by  12  mb,  while  the  GSM  underdeveloped  it  by 
5  mb. 


12  UTC  1-3:  The  RWFM  failed  to  develop  the  cutoff  low  in  the  northeastern  portion  of  the 
long  wave  trough  over  the  Soviet  Union.  Both  the  RWFM  and  GSM  failed  to  develop  the  closed 
1,000-mb  surface  low  southward  towards  the  Caspian  Sea;  the  RWFM  forecast  a  987-mb  low  600 
miles  to  the  northwest  where  HIRAS  had  analyzed  a  1,012-mb  high,  and  the  GSM  forecast  a  994-mb 
low  about  200  miles  northwest  of  where  HIRAS  had  analyzed  the  low.  The  RWFM  underdeveloped 
the  1,038-mb  surface  high  over  Greece  by  10  mb. 

00  UTC  1-6:  The  GSM  intensity  and  position  forecasts  of  the  988  mb  Iceland  and  982  mb 
northern  Soviet  Union  surface  lows  were  much  better  than  the  RWFM,  which  underforecast  these 
lows  by  4  and  6  mb,  respectively.  The  RWFM  and  GSM  underforecast  the  500  mb  cutoff  low  over 
the Barents Sea by 140 and 80 m, respectively. The RWFM underintensified the associated 982 mb
surface low by 6 mb while the GSM intensity matched the HIRAS.

12 UTC 1-8: The RWFM overintensified the 980 mb Scandinavian surface low by 7 mb while
the GSM matched the HIRAS intensity. Both the RWFM and GSM underdeveloped the 1,040-mb
surface high over the eastern Soviet Union by 8 and 10 mb, respectively. The RWFM and GSM
overintensified the 1,028-mb trough east of the Caspian Sea by 10 and 9 mb, respectively.

00  UTC  1-11:  The  GSM  overintensified  the  northern  Soviet  Union  988-mb  surface  low  by 
7  mb.  Both  the  RWFM  and  GSM  overintensified  the  968  mb  Iceland  surface  low  by  7  mb  and 
underdeveloped  the  1,034-mb  Yugoslavian  surface  high  by  4  and  5  mb,  respectively.  The  RWFM 
underforecast  the  depth  of  the  500  mb  Barents  Sea  trough  by  60  m  while  the  GSM  matched  the 
HIRAS  intensity. 

12 UTC 1-13: Both models failed to dig the strong 500 mb short wave trough into Turkey,
where  heights  were  forecast  140  m  too  high.  Both  the  RWFM  and  GSM  underforecast  the 
associated  1,018  mb  surface  low  over  the  eastern  Mediterranean  Sea  by  3  and  6  mb,  respectively. 
Both  models  failed  to  cut  off  the  500-mb  low  over  the  northern  Soviet  Union,  although  the  GSM 
deepened  the  trough  more  than  the  RWFM.  The  RWFM  and  GSM  underdeveloped  the  1,026  mb 
surface  high  over  the  northern  Soviet  Union  by  7  and  5  mb,  respectively.  The  RWFM 
underdeveloped  the  1,040  mb  western  Europe  surface  high  by  4  mb.  The  RWFM  underforecast  the 
955  mb  Iceland  surface  low  by  7  mb  while  the  GSM  overintensified  it  by  only  1  mb. 

00 UTC 1-16: The RWFM underdeveloped the 500-mb high and its associated 1,034 mb
surface high over the Mediterranean Sea by 80 m and 6 mb, respectively. Both the RWFM and
GSM failed to intensify the 500 mb short wave trough over Finland and incorrectly phased it with
the Caspian Sea trough. The RWFM underforecast the Finland 976 mb surface low by only 1 mb, but
the  low  is  elongated  towards  the  southeast  and  not  as  circular  as  the  HIRAS  low,  due  to  incorrect 
phasing  of  the  system  with  the  Caspian  Sea  low.  The  GSM  underforecast  this  system  by  7  mb  and 
suffered  the  same  phasing  problem  as  the  RWFM. 

12 UTC 1-18: The RWFM deepened the 500-mb low too much over the southern Soviet
Union by 60 m and overintensified the associated 1,005 mb surface low by 13 mb. The GSM
correctly maintained the positive tilt and intensity of the southern Soviet Union 500-mb trough but
overintensified the surface low by 7 mb. Both the RWFM and GSM overintensified the 972 mb
surface low over the Arctic Ocean by 14 and 16 mb, respectively, but the forecast position of the
GSM is closer to HIRAS than that of the RWFM. The RWFM underdeveloped the 1,037 mb surface
high  over  southern  Europe  by  9  mb.  Both  the  RWFM  and  GSM  underdeveloped  the  1,040  mb 
surface  high  over  Siberia  by  11  and  16  mb,  respectively. 

00  UTC  1-21:  The  RWFM  overintensified  the  500-mb  low  and  its  associated  1,016  mb 
surface  low  over  Iraq  by  50  m  and  7  mb,  respectively.  The  RWFM  underdeveloped  the  trough  over 
the  Soviet  Union  by  60  m. 

12 UTC 1-23: The RWFM and GSM underforecast the Greenland 500-mb trough by
150  m.  The  GSM  failed  to  develop  the  500  mb  short  wave  ridge  and  the  associated  1,036  mb  surface 
high  over  Siberia.  The  RWFM  and  GSM  incorrectly  developed  individual  500  mb  cutoff  lows  instead 
of splitting the 500-mb trough west of the Siberian high into an open short wave trough and a cutoff
low.  The  RWFM  and  GSM  placed  the  associated  992  mb  surface  low  about  500  miles  northwest 
of  the  HIRAS  analyzed  position  with  intensities  of  989  and  993  mb,  respectively.  The  RWFM  and 
GSM  placed  the  1,038  mb  European  surface  high  about  500  miles  west  of  the  HIRAS  position  with 
intensities  of  1,037  and  1,035  mb,  respectively.  The  RWFM  underforecast  the  975  mb  Arctic  Ocean 
surface  low  by  10  mb  and  placed  it  about  300  miles  south  of  the  position  analyzed  by  HIRAS.  The 
GSM  underforecast  this  low  by  only  3  mb  and  placed  it  very  close  to  the  HIRAS  position. 

00  UTC  1-26:  The  RWFM  underforecast  the  Soviet  Union  trough  by  60  m.  The  GSM 
overintensified  the  1,008  mb  surface  low  over  northern  Siberia  by  15  mb  and  the  977  mb  surface  low 
over  the  Barents  Sea  by  6  mb,  while  the  intensities  of  these  lows  forecast  by  the  RWFM  are  within 
2  mb  of  HIRAS.  The  RWFM  and  GSM  underdeveloped  the  Eurasian  surface  high  by  3  and  5  mb, 
respectively, and  both  the  RWFM  and  GSM  underdeveloped  the  952  mb  Iceland  low  by  12  mb. 

12  UTC  1-28:  The  RWFM  and  GSM  overintensified  the  955  mb  Arctic  low  by  13  and 
16  mb,  respectively.  The  RWFM  failed  to  build  the  1,045-mb  surface  high  across  Europe. 


CENTRAL  AMERICAN  WINDOW  IN  JANUARY  1989 

00  UTC  1-1:  The  RWFM  and  GSM  failed  to  develop  the  1007  mb  closed  surface  low  off 
the  mid-Atlantic  coast,  as  both  models  forecast  open  troughs.  Both  the  RWFM  and  GSM 
underforecast the 1,006-mb surface low over South America by 5 and 6 mb, respectively, with the
RWFM  position  400  miles  too  far  south  and  the  GSM  position  200  miles  too  far  east. 

12  UTC  1-3:  The  RWFM  placed  the  500-mb  trough  south  of  Nova  Scotia  300  miles  to  the 
west of the HIRAS-analyzed position. This placement could explain the 15 mb underintensification
of  the  associated  964  mb  surface  low;  however,  the  RWFM  position  of  the  surface  low  is  very  close 
to  the  HIRAS  position.  The  GSM  forecast  intensity  of  the  surface  low  matched  the  intensity 
analyzed  by  HIRAS,  but  the  low’s  position  is  about  200  miles  west  of  the  HIRAS  position.  The 
RWFM and GSM underforecast the 500-mb low over Central America by 30 and 40 m, respectively.
The RWFM and GSM underdeveloped the 500-mb high over Mexico by 50 and 20 m, respectively.
The RWFM and GSM underforecast the 1,005 mb surface low over South America by 7 mb.

00 UTC 1-6: Both the RWFM and GSM failed to develop the weak 500 mb short wave
trough  and  the  closed  1,017  mb  surface  low  off  the  mid-Atlantic  coast,  developing  weak  surface 
troughs  instead.  The  GSM  showed  more  evidence  of  the  weak  500-mb  trough  than  the  RWFM. 

12  UTC  1-8:  The  GSM  closed  off  the  500-mb  high  over  the  Bahamas  but  underdeveloped 
it  by  20  m,  while  the  RWFM  placed  it  too  far  east  without  closing  it  off.  The  RWFM  deepened  the 
500-mb  trough  too  far  into  Mexico  with  a  subsequent  southward  displacement  of  the  surface  high. 

00  UTC  1-11:  The  RWFM  and  GSM  failed  to  develop  the  weak  surface  trough  east  of  the 
Bahamas.  The  RWFM  correctly  developed  the  weak  mid-Atlantic  coastal  trough  while  the  GSM 
failed  to  develop  it. 

12  UTC  1-13:  The  RWFM  underdeveloped  the  1,036  mb  Atlantic  surface  high  by  3  mb  but 
correctly  developed  the  mid-Atlantic  coastal  trough,  which  the  GSM  did  not  develop.  The  RWFM 
underforecast the midwestern 500 mb short wave trough. Both the RWFM and GSM underforecast
the  Central  American  surface  low  by  only  3  mb,  but  they  underdeveloped  the  South  American 
surface  low  by  5  and  7  mb,  respectively. 

00  UTC  1-16:  The  RWFM  underforecast  the  1,004  mb  Great  Lakes  surface  low  by  4  mb. 

12  UTC  1-18:  Both  the  RWFM  and  GSM  underforecast  the  HIRAS-analyzed  500  mb  short 
wave  trough  and  its  associated  996  mb  surface  low  off  the  U.S.  east  coast:  the  GSM  forecast  a 
closed  1,003-mb  low  and  the  RWFM  an  open  1,004-mb  low.  The  RWFM  and  GSM  failed  to 
develop the 500 mb cutoff lows south of Central America, forecasting heights 60 m too high instead.
The RWFM and GSM underforecast the Central American 1,005 and 1,004 mb surface lows by
6  mb  and  South  American  surface  lows  by  4  mb. 

00 UTC 1-21: The RWFM underforecast the gulf coast 500 mb trough and associated
surface  trough,  resulting  in  an  underforecast  onshore  flow  along  the  southeast  coast.  The  GSM 
forecast  this  system  much  better. 

12 UTC 1-23: The RWFM and GSM underforecast the 500 mb short wave trough and the
associated 1,006 mb surface low off the U.S. east coast by 5 and 8 mb, respectively.

00 UTC 1-26: The RWFM and GSM failed to develop the weak 500 mb short wave trough
over  the  upper  midwestern  U.S. 

12  UTC  1-28:  The  RWFM  and  GSM  underforecast  the  closed  surface  lows  in  Central 
America  and  South  America  by  5  to  7  mb. 


EURASIAN  WINDOW  IN  JULY  1988 

00 UTC 7-1: Both the RWFM and GSM overintensified the 500-mb trough over Turkey by
40  m.  The  RWFM  and  GSM  overintensified  the  1,004  mb  surface  low  over  the  northern  Soviet 
Union  by  4  and  6  mb,  respectively. 

12  UTC  7-3:  The  GSM  underforecast  the  Great  Britain  500  mb  cutoff  low  by  60  m  and  the 
associated  996  mb  surface  low  by  5  mb,  but  the  RWFM  forecast  of  this  system  matched  the  HIRAS 
analysis.  Both  forecast  models  underforecast  the  Arctic  Ocean  500  mb  cutoff  low  by  60  m  and  the 
990  mb  Iranian  surface  low  by  14  mb. 

00  UTC  7-6:  The  RWFM  and  GSM  underforecast  the  500-mb  high  over  Sicily  by  70  and  30 
m,  respectively. 

12  UTC  7-8:  HIRAS  is  missing. 

00  UTC  7-11:  The  RWFM  and  GSM  underdeveloped  the  500-mb  high  over  the  Soviet 
Union  by  70  and  40  m,  respectively.  The  RWFM  and  GSM  overintensified  the  1,008-mb  surface 
trough  over  the  Caspian  Sea  by  7  and  4  mb,  respectively.  The  RWFM  and  GSM  overintensified  the 
1,008  mb  Arctic  trough  north  of  the  Soviet  Union  by  8  and  3  mb,  respectively. 

12  UTC  7-13:  The  RWFM  and  GSM  underforecast  the  500-mb  trough  over  the  Caspian  Sea 
by  40  m.  Both  models  underforecast  the  993  mb  surface  low  over  Iran  by  10  mb.  The  RWFM  and 
GSM  overintensified  the  1,008  mb  Arctic  Ocean  low  by  6  and  5  mb,  respectively,  and  the  1,012  mb 
Greenland  surface  low  by  3  and  5  mb,  respectively. 

00 UTC 7-16: The RWFM and GSM underforecast the 1,004 mb surface low over the
northern  Soviet  Union  by  6  and  4  mb,  respectively,  and  forecast  a  1,012  mb  surface  high  over 
Afghanistan,  where  HIRAS  has  a  996  mb  surface  trough. 

12  UTC  7-18:  The  RWFM  and  GSM  underforecast  the  intensity  of  the  dual  500  mb  cutoff 
low  over  northern  Siberia  and  the  Arctic  Ocean  by  90  and  40  m,  respectively.  The  RWFM 
underforecast  the  associated  995  mb  surface  low  by  6  mb,  while  the  GSM  overintensified  it  by  only 
1  mb.  The  RWFM  and  GSM  underforecast  the  996  mb  Iceland  surface  low  by  4  mb  and  the 
1,006  mb  Scandinavian  surface  low  by  8  mb.  Both  models  overdeveloped  the  European  surface  high 
by  4  mb  and  the  southern  Soviet  Union  surface  high  by  10  mb. 

00  UTC  7-21:  The  RWFM  underforecast  the  500  mb  cutoff  low  over  northern  Siberia  by 
120  m  and  the  associated  994  mb  surface  low  by  13  mb.  The  GSM  underforecast  this  500-mb  low 
by  only  20  m  and  the  surface  low  by  only  1  mb,  but  the  GSM  placed  the  surface  low  about  500  miles 
northwest  of  the  HIRAS  position.  The  GSM  overintensified  the  surface  low  over  the  North  Sea  by 
6  mb,  while  the  RWFM  underforecast  it  by  4  mb.  Both  the  RWFM  and  GSM  underdeveloped  the 
500-mb  high  over  Iran  by  100  m  and  underforecast  the  996  mb  surface  low  over  Afghanistan  by 
8  mb. 


12  UTC  7-23:  The  RWFM  and  GSM  underforecast  the  500  mb  cutoff  low  over  northern 
Siberia  by  160  and  80  m,  respectively,  with  the  RWFM  failing  to  cut  it  off.  The  RWFM  and  GSM 
underforecast  the  associated  surface  low  by  8  and  4  mb,  respectively.  The  RWFM  and  GSM 
underforecast  the  984  mb  Icelandic  surface  low  by  7  and  4  mb,  respectively. 

00  UTC  7-26:  The  RWFM  and  GSM  underforecast  the  500-mb  high  over  Afghanistan 
by  100  m.  The  RWFM  and  GSM  underforecast  the  500-mb  low  over  the  Soviet  Union  by  30  and 
50 m, respectively. The RWFM and GSM underforecast the 996-mb low over Afghanistan by 8 mb.

12  UTC  7-28:  The  RWFM  and  GSM  underforecast  the  500  mb  cutoff  low  over  northern 
Siberia  by  120  and  20  m,  respectively.  The  RWFM  and  GSM  underforecast  the  500  mb  cutoff  low 
over the Soviet Union by 60 and 40 m, respectively. The RWFM and GSM underforecast the Arctic
Ocean 1,026 mb surface high by 4 and 5 mb, respectively.


CENTRAL AMERICAN WINDOW IN JULY 1988


00  UTC  7-1:  The  RWFM  and  GSM  forecast  the  New  England  500  mb  short  wave  trough 
about  200  miles  west  of  the  position  analyzed  by  HIRAS.  The  RWFM  and  GSM  underforecast  this 
trough  by  60  and  30  m,  respectively.  Both  models  underforecast  the  broad  South  American  surface 
low  by  4  to  6  mb. 

12  UTC  7-3:  Both  the  RWFM  and  GSM  underforecast  the  500  mb  cutoff  low  over  South 
America  by  80  m.  The  RWFM  underforecast  the  Central  American  surface  low  by  4  mb.  The 
RWFM  and  GSM  underforecast  the  South  American  closed  surface  low  by  5  and  7  mb,  respectively. 
Both  models  underforecast  the  surface  trough  over  the  Bahamas  by  3  mb. 

00  UTC  7-6:  Both  the  RWFM  and  GSM  failed  to  develop  the  weak  500-mb  trough  off  the 
U.S.  east  coast,  although  the  GSM  hinted  at  more  development  than  the  RWFM.  Both  models 
underdeveloped  the  500  mb  Bermuda  high  by  40  m  and  the  associated  1,032  mb  surface  high  by 
8  mb. 


12  UTC  7-8:  HIRAS  is  missing. 

00  UTC  7-11:  Both  the  RWFM  and  GSM  underforecast  the  South  American  500  mb  cutoff 
low  by  110  m  and  the  associated  1,012  mb  surface  low  by  4  mb.  The  RWFM  underforecast  the 
closed  1,011  mb  surface  low  over  southern  Mexico  by  4  mb,  while  the  GSM  overintensified  this  low 
by  4  mb. 

12  UTC  7-13:  Both  the  RWFM  and  GSM  underforecast  the  broad  area  of  low  500-mb 
heights  south  of  10°  north  by  40  to  80  m.  Both  models  underforecast  the  Montana  surface  low  by 
9  mb  and  the  Central  American  surface  low  by  4  mb. 

00  UTC  7-16:  Both  the  RWFM  and  GSM  underforecast  the  1,028  mb  surface  Bermuda  high 
by  4  mb.  Both  models  underforecast  the  relatively  low  500-mb  heights  south  of  10°  north  latitude 
by  about  50  m. 

12  UTC  7-18:  Both  the  RWFM  and  GSM  failed  to  forecast  the  closed  500-mb  low  over  the 
southern  Gulf  of  Mexico,  missing  the  intensity  by  100  m.  The  RWFM  and  GSM  underforecast  the 
1,032  mb  surface  Bermuda  high  by  7  and  6  mb,  respectively.  Both  models  underforecast 
the  1,011  mb  Central  American  closed  surface  low  by  4  mb  and  the  1,010  mb  Brazilian  closed 
surface  low  by  6  mb. 

00  UTC  7-21:  Both  the  RWFM  and  GSM  underforecast  the  midwestern  U.S.  500-mb  trough 
by  30  m  and  the  Central  American  surface  low  by  5  mb.  Both  models  underforecast  the  surface 
Bermuda  high  by  6  mb. 

12  UTC  7-23:  Both  the  RWFM  and  GSM  underforecast  the  eastern  U.S.  500-mb  trough  by 
30  m.  The  RWFM  failed  to  develop  the  1,015  mb  surface  trough  over  New  England,  while  the  GSM 
underforecast  it  by  4  mb.  Both  models  underforecast  the  1,008  mb  Mexican  closed  surface  low  by 
8  mb  and  the  1,011  mb  Panama  closed  surface  low  by  4  mb. 

00  UTC  7-26:  Both  the  RWFM  and  GSM  underforecast  the  relatively  low  500-mb  heights 
south  of  10°  north  by  40  m.  The  RWFM  and  GSM  underforecast  the  1,011  mb  southern  Mexico 
closed  surface  low  by  5  mb  and  the  1,009  mb  Central  American  closed  surface  low  by  5  and  3  mb, 
respectively. 

12  UTC  7-28:  Both  the  RWFM  and  GSM  underforecast  the  depth  of  the  trough  over  the 
Canadian  Maritimes  by  30  m. 

Overall,  the  GSM  provides  noticeably  smaller  rms  forecast  errors  than  the  RWFM  for  active 
weather  cases  (i.e.,  the  Eurasian  and  Central  American  winter  cases),  while  the  RWFM  provides 
smaller  rms  forecast  errors  for  relatively  inactive  weather  cases  (i.e.,  the  Eurasian  and  Central 
American  summer  cases). 
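
The rms forecast error referred to here is the root mean square of the differences between forecast values and the verifying analysis values at the same points. The sketch below illustrates that arithmetic only. The function name `rms_error` and the sample 500-mb heights are hypothetical and are not taken from the RWFM, GSM, or HIRAS data described in this report.

```python
import math


def rms_error(forecast, analysis):
    """Root-mean-square difference between paired forecast and analysis values."""
    if len(forecast) != len(analysis) or not forecast:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(f - a) ** 2 for f, a in zip(forecast, analysis)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical 500-mb heights (m) at four grid points; illustrative values only.
analysis = [5600.0, 5640.0, 5700.0, 5760.0]
model_a = [5630.0, 5680.0, 5700.0, 5760.0]  # errors of 30, 40, 0, and 0 m
model_b = [5610.0, 5650.0, 5710.0, 5770.0]  # errors of 10 m at every point

print(rms_error(model_a, analysis))  # 25.0
print(rms_error(model_b, analysis))  # 10.0
```

In this made-up example the first model matches the analysis exactly at two points but still has the larger rms error, because squaring gives the two large misses more weight than several small ones.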

APPENDIX C

LIST OF ACRONYMS

AIREPS     aircraft reports
AFGL       Air Force Geophysics Laboratory (now called PL/GP)
AFGWC      Air Force Global Weather Central
ASAP       AFGL Statistical Analysis Package
ASCII      American Standard Code for Information Interchange
CA         Central America
DATSAV2    Datasave 2
ECMWF      European Center for Medium-Range Weather Forecasts
EU         Eurasia
FSR        forward stepwise regression
FGGE       First GARP Global Experiment
GARP       Global Atmospheric Research Program
GDA        global data assimilation
GSM        Global Spectral Model
HIRAS      High Resolution Analysis System
NMI        nonlinear mode initialization
OI         optimum interpolation
OI-1       OI analysis prepared using the 20 closest observations
OI-2       OI analysis prepared using a subset of the observations
           (chosen by FSR) used in OI-1
OSSE       Observing System Simulation Experiment
PL         Phillips Laboratory
PL/GP      Phillips Laboratory Geophysics Directorate
PLSC       Phillips Laboratory Supercomputer Center
RAOB       rawinsonde observation
RAP        Regional Analysis Procedure
RLAM       Relocatable Limited Area Model
rms        root mean square
RWAM       Relocatable Window Analysis Model
RWFM       Relocatable Window Forecast Model
RWFMVER    Relocatable Window Forecast Model Verification
SOAR       second order autoregressive
STC        Science and Technology Corporation
UGDF       uniform gridded data field
UTC        Coordinated Universal Time

